THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 By BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


JANUARY 


VOLUME 80 NUMBER 1 
CODEN: AMMYAE 
CONTENTS 
Unique Factorization Domains . . . . . . . . . P. M. COHN 1 
Continuous Analogues of Series . . . . . R. P. BOAS, JR. AND H. POLLARD 18 
England was lost on the Playing Fields of Eton: A Parable for Mathematics 
. . . . . . . . . A. B. WILLCOX 25 
MATHEMATICAL NOTES 
An Identity Satisfied by Derivations of a Purely Inseparable Field . . . F. P. CALLAHAN 40 
On Sums of Powers of a Number . . . . . . . VLADIMIR DROBOT 42 
A Local Mean Value Theorem for Analytic Functions . . . AKE SAMUELSSON 45 
A Theorem on Set Inclusion in Metric Spaces . . . J. A. HEINEN AND ALBERT WILANSKY 46 
Circle Groups of Nilpotent Rings . . . . . . J. C. AULT AND J. F. WATTERS 48 
RESEARCH PROBLEMS 
Crossing Number Problems . . . . . . . . P. ERDŐS AND R. K. GUY 52 
CLASSROOM NOTES 
A Proof of Uniqueness of Factorization in the Gaussian Integers 
. . . . . . . . . M. F. RUCHTE AND R. W. RYDEN 58 
Some Half-plane Dirichlet Problems: A Bare Hands Approach . . . F. J. FLANIGAN 59 
Single Layer Potentials and the Cauchy-Kowalewski Theorem . . . P. A. NICKEL 61 
A Global Characterization of Uniform Continuity . . . RICHARD CLEVELAND 64 
MATHEMATICAL EDUCATION 
Applied Mathematics at M.I.T. . . . . . . . . . H. P. GREENSPAN 67 
A Letter by Professor Polya . . . . . . . . . 73 
ELEMENTARY PROBLEMS AND SOLUTIONS . . . . . . . . . 74 
ADVANCED PROBLEMS AND SOLUTIONS . . . . . . . . . 82 
(Continued on inside cover) 
1973 


88 


REVIEWS 

NEWS AND NOTICES 

MATHEMATICAL ASSOCIATION OF AMERICA 
May Meeting of the Indiana Section 
Employment Information for Mathematicians 
Calendars of Future Meetings 


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p. 2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 
protection against loss. 
Backlog: Main Articles 12 months, Math. Notes 15 months, Research Problems 7 months, Classroom Notes 
11 months, Math. Education 10 months. 


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to HARLEY FLANDERS, American Mathematical 
Monthly, Tel Aviv University, Ramat Aviv, Israel (see Notice, vol. 77, 1970, p. 555); NOTES, etc.: 
to the corresponding Associate Editor; ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, 
Mathematical Association of America, SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS 
and SUBSCRIPTIONS: to A. B. WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., 
N.W., Washington, D. C. 20036. 


HARLEY FLANDERS, Editor 
ASSOCIATE EDITORS 


JOSHUA BARLAZ        J. G. HARVEY        SEYMOUR SCHUSTER 
E. R. BERLEKAMP      ERIC S. LANGFORD    J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA     P. D. LAX           E. P. STARKE 
ROBERT GILMER        ARTHUR MATTUCK      LYNN A. STEEN 
RICHARD GUY          M. W. POWNALL       JAMES WENDEL 
RAOUL HAILPERN       GIAN-CARLO ROTA 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


UNIQUE FACTORIZATION DOMAINS 
P. M. COHN, Bedford College, University of London 


1. Introduction. One of the most fascinating developments in ring theory in recent 
years is the way in which large parts of algebraic geometry can now be stated entirely 
in terms of commutative Noetherian rings [23]. In the other direction this has led 
to new tools for classifying and investigating these rings; furthermore, these appli- 
cations are no longer confined to Noetherian rings, and although commutativity 
is assumed as a rule, one suspects that even this is not always essential. There is an 
extensive and rapidly growing literature on the subject, and it would be difficult 
to do justice to it in a brief article. Instead, we have singled out a special class of 
rings: unique factorization domains. They provide a good example of how ring- 
theoretical properties can be illustrated by geometrical ideas. The non-commutative 
case is described separately; this is less well developed, and the connection with 
geometry is less clear, but eventually any geometric ring theory must also comprehend 
the non-commutative case. 

We begin with the definition of a commutative unique factorization domain 
(UFD) in section 2, and its relation to such basic notions as Dedekind and Krull 
domains. In section 3 we analyse the definition and reduce it to a statement about 
primes. Nagata’s theorem and its application to Felix Klein’s theorem on line 
complexes is discussed in section 4. 

The remainder of the article is concerned with the non-commutative case. In 
section 5 we discuss the lattice-method of defining and recognizing UFD’s and give 
some non-commutative examples. All are consequences of the fact that an atomic 
2-fir is a UFD. If we drop atomicity, we are left with the Schreier refinement property, 
which so far has mainly been studied in the commutative case (section 6). Section 7 
describes two special cases of interest in the non-commutative theory: the lattice of 
factors is distributive, respectively a chain. In section 8 we examine the shortcomings 
of the definition of non-commutative UFD and describe some remedies that have been 
proposed, and section 9 notes the problems of factorizing zero-divisors. 

Throughout, some of the easier proofs have been sketched and others omitted, 
with a reference where full proofs can be found. When no convenient reference 
was available, proofs are given in more detail. The article is based on a lecture 
delivered at the British Mathematical Colloquium at Leicester on April 1, 1964. 


—— 


Professor Cohn did his Cambridge Ph. D. under Philip Hall, and he has held positions at the 
Univ. de Nancy, Manchester University, Queen Mary College London, and (presently) Bedford 
College, London. He has spent leaves-of-absence at Yale Univ., Univ. of Chicago, and Rutgers. He 
has published extensively in universal algebra and in many branches of algebra; in 1965-67 he was 
the Secretary of the London Mathematical Society. His books are Lie Groups (Cambridge Univ. 
Press 1957), Linear Equations (Routledge and Kegan Paul 1958), Solid Geometry (Routledge and 
Kegan Paul 1961), Universal Algebra (Harper and Row 1965), and Free Rings and Their Relations, 
LMS Monograph 2 (Academic Press 1971). He received an MAA Lester R. Ford Award in 1972. 
Editor. 


2 P. M. COHN [January 


2. Commutative unique factorization domains. All rings are understood to be 
associative, with a unit-element 1, which is inherited by subrings and preserved 
by homomorphisms. Moreover, all modules are unital. Usually our ring R will be 
an integral domain, i.e., the set R* of non-zero elements is non-empty and closed 
under multiplication. This terminology will be used even for non-commutative rings 
(though at first our rings will be commutative). An element u of a ring R is called a 
unit, if there exists $v \in R$ such that $uv = vu = 1$; any non-unit which cannot be 
written as a product of two non-units is said to be irreducible or an atom. An integral 
domain is said to be atomic if every element, not zero or a unit, is a product of atoms. 


DEFINITION. A commutative integral domain R is said to be factorial or a unique 
factorization domain (UFD) if it satisfies the following conditions: 

A. R is atomic, 

U. Any two factorizations of an element into atoms differ only in the order 
of the factors, and by unit factors. 


Thus if $c = a_1 \cdots a_r = b_1 \cdots b_s$, where the $a_i$, $b_j$ are atoms, then $r = s$ and after 
a suitable renumbering of the $b$’s, $a_i$ is associated to $b_i$, i.e., $a_i = b_i u_i$, where $u_i$ 
is a unit. 

The best-known example of a UFD is the ring Z of integers. There are two 
basic ways of proving that Z is a UFD, which we shall call the prime method and 
the lattice method. Both are capable of generalization and we shall deal with each 
in turn (sections 3 and 5). 
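
As a small illustration of conditions A and U in the ring Z, the following sketch factors an integer into atoms by trial division and compares two factorizations up to order and the units 1, -1. The function names `atoms` and `same_up_to_order_and_units` are ours and are only meant as an illustration, not as either of the two proof methods mentioned above.

```python
def atoms(n: int) -> list[int]:
    """Factor a non-zero, non-unit integer into atoms (primes) by trial division."""
    if n in (0, 1, -1):
        raise ValueError("zero and units have no factorization into atoms")
    factors = []
    m = abs(n)
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors.append(p)
            m //= p
        p += 1
    if m > 1:
        factors.append(m)
    if n < 0:
        factors[0] = -factors[0]  # absorb the unit -1 into the first atom
    return factors


def same_up_to_order_and_units(fs: list[int], gs: list[int]) -> bool:
    """Condition U in Z: the atoms agree up to order and the units +1, -1.

    Both lists are assumed to be factorizations of the same element.
    """
    return sorted(abs(f) for f in fs) == sorted(abs(g) for g in gs)


print(atoms(-60))  # [-2, 2, 3, 5]
print(same_up_to_order_and_units([2, 2, -3, 5], [5, -2, 3, -2]))  # True
```

Here -60 = (-2)·2·3·5 = 5·(-2)·3·(-2), and the two factorizations differ only in the order of the factors and by unit factors.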

UFD’s are important for several reasons: In the first place, their characteristic 
property makes them more amenable; secondly, to impose unique factorization 
often singles out a significant class of rings, while thirdly, the methods used to prove 
factoriality have often given rise to other notions important in their own right. 

The unique factorization property of the integers can also be shown to hold 
for the ring of Gaussian integers $a + b\sqrt{-1}$ ($a, b \in \mathbf{Z}$), and this led to efforts to 
prove the same for the ring of integers in any finite algebraic extension of $\mathbf{Q}$. These 
efforts, though doomed to failure, led Kummer to introduce ‘ideal numbers’ in an 
attempt to restore unique factorization. Dedekind [18] gave a general definition 
of ideals and showed that rings of algebraic integers possess unique factorization 
for their ideals. Thus if $\mathfrak{p}_i$ ($i \in I$) are the different prime ideals, any non-zero ideal $\mathfrak{a}$ 
has a unique representation 


(1) $\mathfrak{a} = \prod_{i \in I} \mathfrak{p}_i^{v_i}$, 


where the integers $v_i = v_i(\mathfrak{a})$ are non-negative and all but a finite number of them 
are zero. A commutative integral domain with unique factorization of ideals is called 
a Dedekind domain; such a ring is necessarily Noetherian, i.e., it satisfies the as- 
cending chain condition, briefly ACC, for ideals. Any Noetherian UFD is a Dede- 
kind domain, but there are UFD’s that are not Noetherian, and hence not Dedekind, 
e.g., the polynomial ring in infinitely many indeterminates over a field. 


1973] UNIQUE FACTORIZATION DOMAINS 3 


To get a common generalization of UFD and Dedekind domain, we observe 
that in both cases we have a family of integer-valued functions on $R^*$ satisfying the 
familiar conditions for an exponential valuation: 

V.1. $v(a) \geq 0$, 

V.2. $v(a - b) \geq \min\{v(a), v(b)\}$, 

V.3. $v(ab) = v(a) + v(b)$. 

Such a valuation extends in a unique way to the field of fractions $K$ of $R$. Now 
Krull [30] considered, more generally, integral domains with a family $(v_i)_{i \in I}$ of 
such valuations, where for any $c \in K^*$, $v_i(c) = 0$ for almost all $i$, and $c \in R$ if and 
only if $v_i(c) \geq 0$ for all $i$. Such rings are called Krull domains (Krull called them "endliche 
diskrete Hauptordnungen"); clearly they include both UFD’s and Dedekind domains 
as special cases, e.g., a Noetherian integral domain is a Krull domain if and 
only if it is integrally closed (in its field of fractions). Krull domains retain at least 
some of the useful features of UFD’s; moreover, unlike UFD’s, the class of Krull 
domains is closed under integrally closed integral algebraic extensions [6]. For any 
Krull domain its departure from factoriality is measured by the divisor class group, 
i.e. the group of all divisors (formal products $\prod \mathfrak{p}_i^{v_i}$) modulo the principal divisors 
[36]. Its vanishing characterizes UFD’s; it is unchanged under adjunction of indeterminates, 
but may change under algebraic extension. 
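
The conditions V.1 to V.3 can be spot-checked for the most familiar example, the p-adic valuation on Z and its extension to Q. The sketch below is only a numerical illustration; the function name `v` and the sample values are our own choices.

```python
from fractions import Fraction
from itertools import product


def v(p: int, c) -> int:
    """p-adic valuation of a non-zero rational number c."""
    c = Fraction(c)
    if c == 0:
        raise ValueError("the valuation is defined on non-zero elements only")
    n, d, e = c.numerator, c.denominator, 0
    while n % p == 0:   # powers of p in the numerator count positively
        n //= p
        e += 1
    while d % p == 0:   # powers of p in the denominator count negatively
        d //= p
        e -= 1
    return e


# Spot-check V.1, V.2 and V.3 on a few non-zero integers.
samples = [1, 2, 6, 12, 45, -50, 98]
for p, a, b in product([2, 3, 5, 7], samples, samples):
    assert v(p, a) >= 0                                   # V.1
    assert v(p, a * b) == v(p, a) + v(p, b)               # V.3
    if a != b:
        assert v(p, a - b) >= min(v(p, a), v(p, b))       # V.2

# On the field of fractions the value may be negative: 5/9 is not in Z.
print(v(3, Fraction(5, 9)))  # -2
```

A rational number lies in Z precisely when all these valuations are non-negative, which is the Krull condition for the family of p-adic valuations.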

The relation between the different types of ring becomes clearer if we adopt 
a slightly different point of view. Let $R$ be any commutative integral domain and $K$ 
its field of fractions. On $K^*$ we define the relation of divisibility: $a$ divides $b$, in 
symbols: $a \mid b$, if $ba^{-1} \in R$. Clearly this relation is reflexive and transitive, i.e., 
it is a preordering of $K^*$. Moreover, it is compatible with multiplication: if $a \mid b$, 
then $ac \mid bc$ for all $c \in K^*$. In this way $K^*$ becomes a preordered group; if $U$ is the 
group of units in $R$, then $D = K^*/U$ is the partially ordered group associated with 
the preordered group $K^*$; it is called the divisibility group of $R$. Now various 
classes of rings can be described entirely in terms of the order type of their divisibility 
group. For comparison we shall need $\mathbf{Z}^{(I)}$, the direct sum of card($I$) copies of $\mathbf{Z}$, 
with the componentwise ordering: $(x_i) \leq (y_i)$ if and only if $x_i \leq y_i$ for all $i \in I$. 
Then 


(i) $R$ is a UFD if and only if $D$ is order-isomorphic to $\mathbf{Z}^{(I)}$, for some $I$, 

(ii) if $R$ is a Krull domain then $D$ is order-isomorphic to a subgroup of $\mathbf{Z}^{(I)}$, 
for some $I$, 

(iii) $R$ is a valuation ring if and only if $D$ is totally ordered, 

(iv) $R$ is a discrete valuation ring if and only if $D \cong \mathbf{Z}$. 
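
For R = Z the description of the divisibility group can be made concrete: a non-zero rational is sent to its vector of prime exponents, the units 1 and -1 go to the zero vector, and divisibility becomes the componentwise ordering. The following sketch (with helper names `exponents`, `divides` and `leq` chosen by us) checks this on a few pairs.

```python
from fractions import Fraction


def exponents(c) -> dict[int, int]:
    """Image of a non-zero rational in the divisibility group of Z: prime -> exponent."""
    c = Fraction(c)
    result: dict[int, int] = {}
    # Numerator primes get positive exponents, denominator primes negative ones.
    for sign, m in ((1, abs(c.numerator)), (-1, c.denominator)):
        p = 2
        while m > 1:
            while m % p == 0:
                result[p] = result.get(p, 0) + sign
                m //= p
            p += 1
    return result


def divides(a, b) -> bool:
    """a | b in K* means that b * a^(-1) lies in R = Z."""
    return (Fraction(b) / Fraction(a)).denominator == 1


def leq(x: dict[int, int], y: dict[int, int]) -> bool:
    """Componentwise ordering on the direct sum of copies of Z."""
    return all(x.get(p, 0) <= y.get(p, 0) for p in set(x) | set(y))


pairs = [(6, 12), (Fraction(3, 4), Fraction(9, 2)), (4, 6), (Fraction(5, 3), Fraction(5, 9))]
for a, b in pairs:
    assert divides(a, b) == leq(exponents(a), exponents(b))
```

Since only finitely many exponents are non-zero, the image lies in the direct sum rather than the direct product, in accordance with (i).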


3. The relation of UFD’s to primes. Let us analyse the notion of UFD more 
closely. An element p of a commutative integral domain R is said to be prime if p 
is not zero or a unit and $p \mid ab$ implies $p \mid a$ or $p \mid b$. From this definition it is 
easy to see that each prime must be an atom. The converse need not hold; in fact 
with a finiteness condition it is equivalent to unique factorization: 


THEOREM 1. A commutative ring is a UFD if and only if it is an atomic integral 
domain and every atom is prime. 


This is a sort of localization of the condition U. It is easily proved; in fact the 
usual proof that Z is a UFD consists in verifying that every atom is prime, using 
the Euclidean algorithm, and then carrying out what is in effect a proof of Theorem 
1 [42]. 

Examples of atomic integral domains that are not UFD’s are well known, e.g., 
the integers in the field $\mathbf{Q}(\sqrt{-5})$. To give an example of a different kind, take $R$ 
to be the ring generated by $x_0, x_1, x_2, x_3$ over a field $k$, with the defining relation 


(2) $x_0 x_1 = x_2 x_3$. 


Here $x_0 \mid x_2 x_3$, but $x_0 \nmid x_2$, $x_0 \nmid x_3$, so $x_0$ is not prime, but it is clearly an atom. 
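
The first example can be checked by hand with the norm N(a + b√-5) = a² + 5b², which is multiplicative. The sketch below works in the subring Z[√-5], representing a + b√-5 as the pair (a, b); the representation and the helper names are assumptions made for the illustration. It confirms that 6 = 2·3 = (1 + √-5)(1 - √-5), that 2 is an atom (no element has norm 2), and that 2 divides neither factor on the right, so 2 is not prime.

```python
def mul(u, v):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), as pairs of integers."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)


def norm(u):
    a, b = u
    return a * a + 5 * b * b


def has_element_of_norm(n: int) -> bool:
    """Search the finite box |a|, |b| <= sqrt(n) for a solution of a^2 + 5 b^2 = n."""
    r = int(n ** 0.5) + 1
    return any(a * a + 5 * b * b == n
               for a in range(-r, r + 1) for b in range(-r, r + 1))


def divisible_by_2(u) -> bool:
    """2 divides a + b*sqrt(-5) in Z[sqrt(-5)] exactly when a and b are both even."""
    return u[0] % 2 == 0 and u[1] % 2 == 0


assert mul((2, 0), (3, 0)) == (6, 0) == mul((1, 1), (1, -1))
# A proper factor of 2 (norm 4) would have norm 2; none exists, so 2 is an atom.
assert norm((2, 0)) == 4 and not has_element_of_norm(2)
# 2 divides the product (1 + sqrt(-5))(1 - sqrt(-5)) = 6 but neither factor.
assert divisible_by_2((6, 0))
assert not divisible_by_2((1, 1)) and not divisible_by_2((1, -1))
```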

Regarding atomicity, this is a finiteness condition which clearly holds in every 
Noetherian domain. More generally, it holds in every integral domain with ascending 
chain condition on principal ideals, ACC$_1$ for short. Conversely, every UFD 
satisfies ACC$_1$, but there are atomic domains not possessing ACC$_1$ (notwithstanding 
an assertion to the contrary in Proposition 1.1 of [12]), e.g., the ring of all polynomials 
in $x$ and $y$ with rational coefficients, but where $x^r y^s$ has an integral coefficient 
whenever $rs = 0$. On the other hand, the Noetherian condition is not necessary 
in a UFD, as we have seen, but as a rule this is the important case for algebraic 
geometry. 

A less obvious factoriality criterion is obtained by using prime ideals. An ideal p 
in a ring R is said to be prime if R/p is an integral domain. E.g., in an integral 
domain a principal ideal (p) is prime precisely when p is 0 or a prime element. Another 
way of describing a prime ideal is as an ideal whose complement is multiplicative, 
i.e., a nonempty multiplicatively closed set. 

We recall that given any subset $S$ of a ring $R$, and any ideal $\mathfrak{a}$ disjoint from $S$, 
the standard application of Zorn’s lemma produces an ideal $\mathfrak{p}$ containing $\mathfrak{a}$ and 
maximal subject to the condition $\mathfrak{p} \cap S = \emptyset$. Moreover, if $S$ is multiplicative, 
$\mathfrak{p}$ is prime, as is easily checked (and well known). Then we have the following 
characterization of UFD’s in terms of prime ideals [28]: 


THEOREM 2. An integral domain is a UFD if and only if every nonzero prime ideal 
contains a prime element. 


We recall the essence of the proof. If $R$ is a UFD and $\mathfrak{p} \neq 0$ a prime ideal, let 
$\mathfrak{p}$ contain $a = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, where the $p_i$ are primes; then $p_i \in \mathfrak{p}$ for some $i = 1, \dots, r$. 
Conversely, assume the condition and let $S$ be the set of all products of primes. 
Then $S$ is multiplicative and moreover, it is saturated, i.e., any factor of an element 
of $S$ is itself in $S$. If there is a non-unit $c$ not in $S$, then $(c) \cap S = \emptyset$, so an 
ideal $\mathfrak{p}$ containing $c$ and maximal among the ideals disjoint from $S$ exists; it must be prime and by hypothesis 
contains a prime element, which contradicts the fact that all these elements lie 
in $S$. Now the usual proof of Theorem 1 shows that a product of primes is necessarily 
unique. 
In particular we have the 


COROLLARY 1. In a UFD, every minimal non-zero prime ideal is principal. 


In a Noetherian domain the converse holds, i.e., a Noetherian domain in which 
every minimal non-zero prime ideal is principal is a UFD; this follows from Theorem 
2, because in this case every non-zero prime ideal contains a minimal non-zero 
prime ideal. But this is a non-trivial result, the consequence of Krull’s ‘principal 
ideal theorem’ (cf. [28], where the latter is described as "probably the most important 
single theorem in the theory of Noetherian rings"). 

To give a geometrical illustration, consider a twisted cubic. This cannot be 
obtained as the complete intersection of two surfaces in 3-space, for if the surfaces 
had degrees $m$ and $n$, then $mn = 3$ and so $m$ or $n$ is 1 and the cubic would be plane. 
In fact, as is well known, a twisted cubic can be obtained as the intersection of three 
suitable quadrics, or as the intersection of two quadrics with a common generator, 
if we ignore the generator. 
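
The statement about quadrics can be checked on the standard parametrization of the twisted cubic, (s, t) ↦ (s³, s²t, st², t³). The three quadrics used below are the usual 2 x 2 minors, and this coordinate choice is our assumption rather than anything fixed by the text. The sketch verifies that the curve lies on all three, and that the first two also share the line x₁ = x₂ = 0, on which the third quadric does not vanish identically.

```python
def twisted_cubic(s: int, t: int) -> tuple[int, int, int, int]:
    """A point (x0, x1, x2, x3) of the twisted cubic in its standard parametrization."""
    return (s**3, s**2 * t, s * t**2, t**3)


def q1(x):
    return x[0] * x[2] - x[1] ** 2


def q2(x):
    return x[1] * x[3] - x[2] ** 2


def q3(x):
    return x[0] * x[3] - x[1] * x[2]


# Every point of the curve lies on all three quadrics.
for s in range(-3, 4):
    for t in range(-3, 4):
        point = twisted_cubic(s, t)
        assert q1(point) == q2(point) == q3(point) == 0

# The first two quadrics also contain the line x1 = x2 = 0,
# so their intersection is the cubic together with this line.
line_point = (2, 0, 0, 5)
assert q1(line_point) == 0 and q2(line_point) == 0
assert q3(line_point) == 10  # the third quadric cuts the extra line away
```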

Geometrically, any non-degenerate quadric in complex projective 3-space can 
be brought to the form $x_0 x_1 = x_2 x_3$. The ring of functions on this quadric is the 
ring $A = \mathbf{C}[X_0, X_1, X_2, X_3]/(f)$, where $f = X_0 X_1 - X_2 X_3$. This ring is an integral 
domain, but not a UFD, as we saw earlier. Now a minimal non-zero prime ideal 
of $A$ corresponds to a maximal proper subvariety of the quadric, i.e., a curve, and 
by Th. 2, we cannot always expect this to be given by a single equation; the twisted 
cubic is a case in point. (This is a slight oversimplification, because subvarieties are 
actually defined by homogeneous ideals in this case.) 

Generally, if $k$ is an algebraically closed field, an algebraic set over $k$ is given 
by the zeros in $k^n$ of a set of polynomials in $X_1, \dots, X_n$ and the coordinate ring 
of this algebraic set is 

$A = k[X_1, \dots, X_n]/\mathfrak{a}$, 


where $\mathfrak{a}$ is the ideal generated by the given set of polynomials. We have a variety 
(= irreducible algebraic set) if and only if $\mathfrak{a}$ can be taken to be a prime ideal, and 
then A is an integral domain. If we take for granted the fact that every maximal 
subvariety of an n-dimensional variety has codimension 1 (i.e., is (n — 1)-dimensional, 
cf. [32], p. 36), Th. 2, Cor. 1 gives the necessity of the next result; the sufficiency 
follows by the remark following Th. 2, Cor. 1, because the coordinate ring of a 
variety is Noetherian. 


COROLLARY 2. The coordinate ring of a variety is a UFD if and only if every sub- 
variety of codimension 1 determines a principal ideal (thus the subvariety is a 
complete intersection). 


If $V$ is any variety, the set $\mathfrak{o}_x$ of functions defined at a given point $x$ of $V$ is a 
local ring, i.e., a ring whose non-units form an ideal (the ideal of functions vanishing 
at $x$). If $x$ is a simple point of $V$, $\mathfrak{o}_x$ is what is called a regular local ring, and this 
is necessarily a UFD. There are several proofs of this fact; for a thorough analysis 
of the algebraic background, see [28], and for a history of the problem, see [35]. 


4. Nagata’s Theorem. How does one prove that a given ring is a UFD? In the case 
of Z we needed the Euclidean algorithm ([19], Book VII, Prop. 1-2) to prove that 
every atom is prime. For polynomial rings in one variable over a field one can use 
the same method (introduced by Stevin [40] to find the greatest common divisor 
of two polynomials), but for more than one variable this method is no longer avail- 
able (in the next section we shall see why). However, it is still true that a polynomial 
ring in any number of variables over a field is a UFD. More generally, if R is a 
UFD, then so is R[X]; the proof depends on forming rings of fractions. 
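
For polynomials in one variable over a field, the Euclidean algorithm mentioned above is short enough to write out. The sketch below works over Q with coefficients listed from the constant term upwards; this representation and the function names are our assumptions, chosen only to keep the example self-contained.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients (coefficients are listed lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_mod(f, g):
    """Remainder of f on division by a non-zero polynomial g over Q."""
    f, g = trim(f), trim(g)
    while len(f) >= len(g):
        q = Fraction(f[-1]) / Fraction(g[-1])   # quotient of leading coefficients
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] -= q * c               # cancels the leading term of f
        f = trim(f)
    return f


def poly_gcd(f, g):
    """Monic greatest common divisor in Q[X]; f and g must not both be zero."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, poly_mod(f, g)
    lead = Fraction(f[-1])
    return [Fraction(c) / lead for c in f]


# gcd(X^2 - 1, X^2 - 2X + 1) = X - 1
print(poly_gcd([-1, 0, 1], [1, -2, 1]))
```

Each division step lowers the degree, which is what fails when a second variable is adjoined: there is then no single degree function along which every division can be carried out.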

Let R be an integral domain with field of fractions K. Given a multiplicative 
subset S of R*, write 


$R_S = \{a/s \mid a \in R, s \in S\}$. 


This is again a ring, in fact a subring of $K$, e.g., $R_{R^*} = K$. Any prime $p$ in $R$ either 
becomes a unit in $R_S$ or it remains prime, depending on whether or not $p$ divides 
an element of $S$. Moreover, any atom in $R_S$ comes from an atom in $R$. Thus by 
Th. 1 we obtain: 


THEOREM 3. If $R$ is a UFD and $S$ any multiplicative subset of $R^*$, then $R_S$ is also 
a UFD. 


Conversely, we have the following result, first proved by Nagata [34] (for 
Noetherian domains): 


THEOREM 4. Let $R$ be an atomic integral domain and $S$ a multiplicative subset of 
$R$ consisting of products of primes. Then if $R_S$ is a UFD, so is $R$. 


The proof consists roughly in this: every atom of $R$ either divides an element 
of $S$ and is then shown to be prime (using the fact that $S$ consists of prime products), 
or if it divides no element of $S$ it stays an atom in $R_S$ and is then prime because $R_S$ 
is a UFD (cf. [17], p. 116). 

With the help of Th. 4 it is very easy to show that factoriality is preserved by 
adjunction of indeterminates. 


COROLLARY 1. If $R$ is a UFD, then so is the polynomial ring $R[X]$. 


For if $K = R_{R^*}$ is the field of fractions, then by the Euclidean algorithm, 
$K[X]$ is a UFD. Now $K[X] = R[X]_{R^*}$ and $R^*$ consists of prime products, because 
$R$ is a UFD, while Gauss’s lemma ensures that any prime of $R$ is still a prime in 
$R[X]$. Further, $R[X]$ satisfies ACC$_1$: one gets a bound on the number of non-unit 
factors by considering leading terms. Hence by Th. 4, $R[X]$ is a UFD. 


Cor. 1 shows (by induction) that the polynomial ring in any finite number of 
indeterminates over a field is a UFD. To extend the result to infinitely many indeter- 
minates one can proceed as follows. 

Let $A$, $B$ be any rings, such that $A$ is a subring of $B$. The inclusion $A \subseteq B$ is 
said to be inert if for any $c \in A$ such that $c = ab$, where $a, b \in B$, there exists a 
unit $u \in B$ such that $au, u^{-1}b \in A$. E.g., any integral domain $R$ is inert in the polynomial 
ring $R[X]$. Now let $R$ be a ring which is a union of a directed system of subrings 
$R_\lambda$ (i.e. any two subrings of the system are contained in a third). If all the $R_\lambda$ 
are UFD’s and all the inclusions $R_\lambda \subseteq R_\mu$ are inert, then $R$ is a UFD. For any 
$c \in R$ lies in some $R_\lambda$ and so has a unique factorization into atoms in $R_\lambda$; moreover 
any factorization of $c$ in a bigger ring $R_\mu$ can by inertia be pulled down to $R_\lambda$ and 
so must agree with the factorization already found. 

We note that the inertia condition cannot be omitted: the semigroup algebra 
(over a field) of the additive semigroup of positive rational numbers is not a UFD, 
but it can be written as the union of a directed system of UFD’s. 

Now let $\mathcal{X} = (X_i)_{i \in I}$ be an infinite family of indeterminates and $(R_\lambda)$ the family 
of rings obtained by adjoining finitely many of the $X$’s to a given UFD $R$. The $R_\lambda$ 
form a directed system of subrings of $R[\mathcal{X}]$ with inert inclusions, and each is a 
UFD (by Cor. 1), hence $R[\mathcal{X}]$ is again a UFD. 

As a second application of Th. 4, due to Nagata [34, 37], we show that the coordinate 
ring of a quadric in more than three dimensions is a UFD. 


COROLLARY 2. Let $k$ be an algebraically closed field of characteristic not two, and 
$Q(X_1, \dots, X_n)$ a non-degenerate quadratic form in $n \geq 5$ variables. Then 
$A = k[X_1, \dots, X_n]/(Q)$ is a UFD. 


Proof. We can always write $Q = X_1 X_2 + Q_1(X_3, \dots, X_n)$; here $Q_1$ is irreducible 
because $n \geq 5$. Writing $x_i$ for the residue class of $X_i$ (mod $Q$), we have 
$A = k[x_1, \dots, x_n]$. Let $S$ be the multiplicative set generated by $X_2$ in $k[X_2, \dots, X_n]$ 
and $S'$ the multiplicative set generated by $x_2$ in $A$; then 


$A_{S'} = k[x_2, \dots, x_n][x_2^{-1}] = k[x_2, \dots, x_n]_{S'} \cong k[X_2, \dots, X_n]_S$. 


Since $k[X_2, \dots, X_n]$ is a UFD, the ring on the right is a UFD by Th. 3. Now $x_2$ 
is prime in $A$ because $Q_1$ is irreducible, so $A$ is a UFD by Th. 4. 
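
The two steps of this proof that are used without comment can be written out as follows. We assume the indexing $Q = X_1X_2 + Q_1(X_3, \dots, X_n)$, with the variable $X_2$ being inverted; the printed indices are hard to make out, so this choice is a reconstruction rather than a quotation.

```latex
% In A the defining relation reads x_1 x_2 + Q_1(x_3, ..., x_n) = 0.
% Once x_2 is inverted, x_1 can be eliminated:
\[
  x_1 = -x_2^{-1}\, Q_1(x_3, \dots, x_n)
  \quad\text{in } A_{S'},
  \qquad\text{so}\qquad
  A_{S'} = k[x_2, \dots, x_n][x_2^{-1}].
\]
% Primality of x_2: setting x_2 = 0 leaves only the relation Q_1 = 0, hence
\[
  A/(x_2) \;\cong\; k[X_1, X_3, \dots, X_n]/(Q_1),
\]
% which is an integral domain because Q_1 is irreducible in the UFD
% k[X_1, X_3, ..., X_n]; therefore (x_2) is a prime ideal of A.
```

Since $A$ is Noetherian, it is atomic, so Th. 4 applies with the multiplicative set generated by the prime $x_2$.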

Cor. 2 has an interesting geometrical consequence due to Klein. The lines in 
projective 3-space are described by Plücker coordinates $\pi_{ij}$ ($i, j = 0, \dots, 3$), subject 
to the relation (cf. [41], p. 22): 


(3) $\pi_{01}\pi_{23} + \pi_{02}\pi_{31} + \pi_{03}\pi_{12} = 0$. 


Each set of ratios $(\pi_{ij})$ satisfying (3) defines a line, so that the set of lines in 3-space 
may be viewed as a quadric (clearly non-degenerate) in projective 5-space, the 
Klein quadric. Thus the lines in 3-space form an algebraic variety; an algebraic 
subset of codimension 1 on this variety is called a line complex. Now by Th. 4, 
Cor. 2, the coordinate ring of the Klein quadric is a UFD and so (by Th. 2, Cor. 2) 
its subvarieties of codimension 1 are complete intersections. Thus we get: 


KLEIN’S THEOREM. Every irreducible line complex in projective 3-space is given by a 
single equation in Plücker coordinates (besides (3)). 
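
Relation (3) can be checked numerically. Taking the line through two points p, q of projective 3-space, we use the usual convention that the Plücker coordinate with indices i, j is p_i q_j - p_j q_i; this convention, like the function names below, is an assumption of the sketch. The quadratic form of (3) then vanishes for every pair of points.

```python
import random


def plucker(p, q):
    """Pluecker coordinates pi[i, j] = p_i q_j - p_j q_i of the line through p and q."""
    return {(i, j): p[i] * q[j] - p[j] * q[i] for i in range(4) for j in range(4)}


def klein_form(pi):
    """Left-hand side of relation (3)."""
    return pi[0, 1] * pi[2, 3] + pi[0, 2] * pi[3, 1] + pi[0, 3] * pi[1, 2]


random.seed(0)
for _ in range(100):
    p = [random.randint(-9, 9) for _ in range(4)]
    q = [random.randint(-9, 9) for _ in range(4)]
    assert klein_form(plucker(p, q)) == 0
```

Of the sixteen values only six are independent (the coordinates are skew-symmetric in i, j), which is why the lines form a quadric in projective 5-space.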


5. The lattice method. We now turn to the second method of studying UFD’s, the 
lattice method. This starts from quite a different definition of UFD, though of course 
equivalent to the one given earlier. It leads to other generalizations, and in parti- 
cular, it does not require the ring to be commutative. 

Thus let R be an integral domain (not necessarily commutative). If ce R* and 


(4) $c = a_1 \cdots a_r$, 

we consider the sequence of right ideals from $R$ to $cR$: 

$R \supseteq a_1 R \supseteq a_1 a_2 R \supseteq \cdots \supseteq a_1 \cdots a_r R = cR$; 

with it we associate the corresponding quotients 

(5) $R/a_1 R,\ a_1 R/a_1 a_2 R \cong R/a_2 R,\ \dots,\ R/a_r R$. 

If we have a second factorization of $c$: 

(6) $c = b_1 \cdots b_s$, 


with quotients $R/b_1 R, \dots, R/b_s R$, we say that (4) and (6) are isomorphic if $r = s$ 
and there is a permutation $i \mapsto i'$ of $\{1, \dots, r\}$ such that $R/a_i R \cong R/b_{i'} R$. Now we 
define a (general) UFD as an atomic integral domain in which any two complete 
factorizations of a given element are isomorphic. 

This provides a definition of UFD for the non-commutative case. Although 
stated in terms of right modules, it turns out to be left-right symmetric. Further, 
it reduces to the previous definition in the commutative case; to see this we note 
that if in a commutative integral domain, $R/aR \cong R/bR$, then $aR$ is the annihilator 
of R/aR and so aR = bR, from which it follows that a and b are associated. 

Let $R$ be any ring; a right $R$-module $M$ is said to be strictly cyclic if it can be 
written as $R/cR$, where $c$ is a non-zerodivisor. We denote by $\mathcal{C}_R$ the category whose 
objects are all the strictly cyclic right $R$-modules, while the morphisms are all the 
homomorphisms between them. The category ${}_R\mathcal{C}$ of strictly cyclic left $R$-modules 
is defined correspondingly. Any homomorphism $f: R/aR \to R/bR$ in $\mathcal{C}_R$ is given 
by an equation 


$ca = bc'$, 


and based on this fact one shows that there is a duality (i.e., a category anti-isomorphism) 
between $\mathcal{C}_R$ and ${}_R\mathcal{C}$, for any ring $R$ [13, 17]. We shall call it the factorial 
duality in $R$; in particular this shows that 


(7) $R/aR \cong R/bR$ if and only if $R/Ra \cong R/Rb$, 


from which the left-right symmetry of the above notion of UFD follows immediately. 
Let us call two non-zerodivisors $a$, $b$ of a ring $R$ similar if $R/aR \cong R/bR$ [20, 24]. 
By (7) this notion is left-right symmetric; the corresponding notions for zero-divisors 
are distinct, as examples by Fitting [20] show. 

Earlier we saw that in a commutative ring two elements are similar precisely 
when they are associated; in general there is no such simple criterion. Some equiv- 
alent conditions are given in 


THEOREM 5. Let $a, a'$ be two non-zerodivisors in a ring $R$. Then the following three 
conditions are equivalent: 

(i) $a$ and $a'$ are similar, 

(ii) the matrices $\begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} a' & 0 \\ 0 & 1 \end{pmatrix}$ are associated in $R_2$, 

(iii) there exist mutually inverse 2 x 2 matrices $\mu$ with $a$ in the (1, 1)-position 
and $\mu^{-1}$ with $a'$ in the (2, 2)-position. 


For a proof see [17], p. 124 f. ((i) $\Leftrightarrow$ (ii) was proved in [20] and (i) $\Leftrightarrow$ (iii) 
in [9]). Here are some examples of non-commutative UFD’s. 

1. The ring of integral quaternions (a rational quaternion is said to be integral 
if its coefficients are integers or halves of odd integers). 

2. The ring of linear differential operators. Let k = C(t) be the field of rational 
functions in a single variable and D an indeterminate over k with the commutation 
rule 


(8) $Df = fD + f'$, where $f' = df/dt$. 


The skew polynomials $\sum f_i D^i$ ($f_i \in k$) with multiplication according to (8) form 
a UFD. This, probably one of the first examples, was established by Landau [31] 
and Loewy [33]. Two polynomials in D which are similar in the sense explained 
above define differential equations which are equivalent in the sense of Poincaré. 
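
The commutation rule (8) is the product rule in operator form: applying both sides to a function g gives (fg)' = fg' + f'g. The sketch below checks this with f and g restricted to polynomials in t with integer coefficients, listed from the constant term upwards. This restriction, and the helper names, are simplifying assumptions of ours, since the text works over the field of rational functions.

```python
def trim(f):
    """Drop trailing zero coefficients (lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def poly_add(f, g):
    n = max(len(f), len(g))
    f, g = list(f) + [0] * (n - len(f)), list(g) + [0] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def deriv(f):
    """The operator D = d/dt on polynomials."""
    return trim(i * c for i, c in enumerate(f))[1:]


def D_after_f(f, g):
    """Left-hand side of (8) applied to g: first multiply by f, then differentiate."""
    return deriv(poly_mul(f, g))


def fD_plus_fprime(f, g):
    """Right-hand side of (8) applied to g: f * g' + f' * g."""
    return poly_add(poly_mul(f, deriv(g)), poly_mul(deriv(f), g))


f = [1, 0, 1]       # 1 + t^2
g = [0, 3, 0, -1]   # 3t - t^3
assert D_after_f(f, g) == fD_plus_fprime(f, g)
```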

3. Free associative algebras [13, 17]. Every free associative algebra $k\langle x_1, \dots, x_n \rangle$ 
over a field is a UFD; as an example of a non-trivial factorization in the free 
algebra $k\langle x, y \rangle$ we have 


$xyx + x = x(yx + 1) = (xy + 1)x$, 


and as is easily seen, $xy + 1$ and $yx + 1$ are similar atoms. 
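
The two factorizations of xyx + x can be verified mechanically. In the sketch below an element of the free algebra is stored as a dictionary from words (strings in the letters x, y, with the empty string for 1) to integer coefficients; this encoding and the function names are our own, chosen for the illustration.

```python
from collections import Counter


def mul(f, g):
    """Product in the free algebra: words multiply by concatenation."""
    out = Counter()
    for u, a in f.items():
        for w, b in g.items():
            out[u + w] += a * b
    return {w: c for w, c in out.items() if c != 0}


def add(f, g):
    out = Counter(f)
    for w, c in g.items():
        out[w] += c
    return {w: c for w, c in out.items() if c != 0}


x, y, one = {"x": 1}, {"y": 1}, {"": 1}

lhs = add(mul(mul(x, y), x), x)        # xyx + x
left = mul(x, add(mul(y, x), one))     # x(yx + 1)
right = mul(add(mul(x, y), one), x)    # (xy + 1)x

assert lhs == left == right
assert add(mul(x, y), one) != add(mul(y, x), one)  # xy + 1 and yx + 1 are different elements
```

The atoms xy + 1 and yx + 1 are distinct and not associated (the only units are the non-zero scalars), so the two factorizations agree only up to similarity.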

4. Group algebras of free groups [13, 17]. 

5. Free products of skew fields [13]. 

The definition given at the beginning of this section suggests a way of proving 
a ring to be a UFD: 


THEOREM 6. An integral domain $R$ is a UFD whenever for each $c \in R^*$ the set 
$L(cR, R)$ of principal right ideals between $R$ and $cR$ forms a modular sublattice of 
finite length of the lattice of all right ideals of $R$. 


For then we can apply the Jordan-Hölder theorem for modular lattices [4, 10]. 
An example of a modular lattice is the lattice of all right ideals of a ring R; this is 
the lattice of all submodules of R regarded as a right R-module. The lattice of all 
submodules of any module is modular (hence the name). Therefore in a principal 
ideal domain (i.e., an integral domain in which every left or right ideal is principal) 
the principal right ideals between R and cR form a modular lattice; the ACC holds 
because every right ideal is finitely generated, while the DCC follows by the factorial 
duality, using the fact that R has ACC for left ideals. So we obtain 


COROLLARY 1. Every principal ideal domain is a UFD. 


Both the integral quaternions and the ring of linear differential operators are 
principal ideal domains and therefore are UFD’s. The free associative algebra (on 
more than one free generator) is clearly not a principal ideal domain; so we look 
for weaker hypotheses from which to deduce Th. 6. Whenever the principal right 
ideals form a sublattice of the lattice of all right ideals, they form a modular lattice; 
this is so provided that for any a,beR there exist d,meR such that 


$aR + bR = dR$, $aR \cap bR = mR$. 


The first equation leads to the Bezout identity au + bv = d for the greatest common 
divisor d of a and b, and these rings are called right Bezout domains. They are 
just the integral domains in which every finitely generated right ideal is principal, 
thus they are not much more general than principal right ideal domains. To get a 
wider class, let us look at free algebras. Any free associative algebra over a field 
has the following property [13]: 

Every right ideal is free as right R-module, of unique rank. 

A ring with this property is called a free right ideal ring or right fir for short. 
Left firs are defined similarly and a left and right fir is called a fir. E.g., free algebras 
(over a field), group algebras of free groups, and free products of skew fields are all 
firs. This is proved by the weak algorithm, a generalization of the Euclidean algo- 
rithm (to which it reduces in the commutative case). From this point of view the 
polynomial ring in one variable k[X] is just the free associative algebra on one 
generator. This explains why the Euclidean algorithm for polynomials in one variable 
does not extend to more variables: it only applies to free algebras. 
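
In the commutative model case R = Z, the two equations aR + bR = dR and aR ∩ bR = mR say that d is the greatest common divisor and m the least common multiple, and the Bezout identity is produced by the extended Euclidean algorithm. The following sketch (the function name `bezout` is ours) shows this special case only; it is not the weak algorithm.

```python
def bezout(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, u, v) with a*u + b*v = d and d = gcd(a, b) >= 0."""
    r0, r1 = a, b
    u0, u1 = 1, 0
    v0, v1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        # Invariant: r0 = a*u0 + b*v0 and r1 = a*u1 + b*v1.
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    if r0 < 0:
        r0, u0, v0 = -r0, -u0, -v0
    return r0, u0, v0


a, b = 84, 30
d, u, v = bezout(a, b)
assert a * u + b * v == d == 6   # generator of aZ + bZ
m = a * b // d
assert m == 420                  # generator of the intersection of aZ and bZ
```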

Any fir satisfies left and right ACC$_1$ [13, 17]. Moreover, the mapping 
$(x, y) \mapsto ax - by$ from $R^2$ to $aR + bR$ defines an exact sequence 


(9) $0 \to aR \cap bR \to R^2 \to aR + bR \to 0$, 


which necessarily splits (because $aR + bR$ is free); hence $aR + bR$ and $aR \cap bR$ 
are principal whenever $aR \cap bR \neq 0$. Thus all the conditions of Th. 6 hold and 
we find [13, 17]: 


COROLLARY 2. Any (left and right) fir is a UFD. 


Let us define, for any $n \geq 1$, an n-fir as a ring in which every right ideal on at 
most n generators is free, of unique rank. The notion so defined is left-right symmetric 
and for larger n we get smaller classes, until we get to semifirs, the rings that are n-firs 
for all n. The 1-firs are just the integral domains, and a 2-fir is a ring in which each 
2-generator right ideal is free, of unique rank. Looking more closely to see what 
was needed to prove Cor. 2, we obtain [9, 17]: 


COROLLARY 3. Every atomic 2-fir is a UFD.


This generalizes Cor. 2, because every fir is clearly an atomic 2-fir. In the com- 
mutative case, every atomic 2-fir is a principal ideal domain, so Cor. 2 and 3 tell 
us nothing new for commutative rings. But we have seen examples of non-principal 
firs (free algebras) and there are also atomic 2-firs that are not firs [3, 11]. For 
example, to obtain a ring R such that R* can be embedded in a group but R cannot
be embedded in a (skew) field (Malcev’s problem), Bowtell [7] constructs an atomic 
2-fir; this ring cannot be a fir because every fir is embeddable in a field [16]. We 
remark in passing that if R is any atomic 2-fir, then R* is embeddable in a group 
[16]; it is not known whether this property is shared by all 2-firs, or by all UFD’s. 

The problem of unique factorization has also been studied in rings with a set 
of defining relations of the form ab = cd, where a, b, c, d are atoms in the free 
algebra, by Bokut’ [5]. 


6. The Schreier refinement property. A look at the lattice method of defining 
UFD’s immediately suggests the generalization obtained by giving up atomicity. 
Let us define an S-ring (for Schreier) as an integral domain in which any two fac- 
torizations of any non-zero element have isomorphic refinements. 

For the moment let us return to the commutative case; in the presence of ACC$_1$,
S-rings reduce of course to UFD’s, but in general these classes are distinct. In fact 
there is an intermediate class, the HCF-rings: an HCF-ring is an integral domain 
in which any two elements have a highest common factor. In terms of the divisi- 
bility group D of a ring (section 2) we can say that R is an HCF-ring if and only 
if D is lattice-ordered, while R is an S-ring precisely if D has the (m,n)-interpolation
property, for all m, n: given $x_1, \dots, x_m, y_1, \dots, y_n \in D$, if $x_i \leq y_j$ ($i = 1, \dots, m$,
$j = 1, \dots, n$), then there exists $z \in D$ such that $x_i \leq z \leq y_j$ (all $i, j$). This is actually
a consequence of the (2,2)-interpolation property [4, 12]. 

Clearly we have the implications 


$$\text{UFD} \Rightarrow \text{HCF-ring} \Rightarrow \text{S-ring},$$


and neither of these arrows can be reversed [12]. Moreover, any HCF-ring is integrally
closed (in its field of fractions), but this need not be true of S-rings, as is
shown by the following example, due (independently) to G. M. Bergman and M. 
Kneser (unpublished): 

Let F be a field with a proper algebraic extension E, and consider the ring of 




all formal power series $a_0 + \sum a_i x^{\lambda_i}$, where $a_0 \in F$, $a_i \in E$ ($i > 0$) and $(\lambda_i)$ is a
sequence of positive rational numbers tending to infinity. Then R is an S-ring, but 
not integrally closed. 

By an argument somewhat analogous to the proof of Th. 4 one shows that if 
R is an integrally closed S-ring, then so is R[X] (cf. [12]). Here the hypothesis of
integral closure cannot be omitted; in fact if R[X] is an S-ring, it is easy to see
that R must be an integrally closed S-ring. Thus it is more natural to confine atten- 
tion to integrally closed S-rings. These rings are studied in [12], where they are 
called Schreier rings. 

Turning to the non-commutative case, we note that every 2-fir is an S-ring. 
It is not difficult to give examples of non-atomic 2-firs: apart from the commutative 
examples there is the group algebra of a free product of copies of the additive group 
of rational numbers; this is a non-Ore semifir which is non-atomic. It seems more 
difficult to produce examples of non-Ore semifirs (or even 2-firs) that are atomless. 

The commutative case suggests that there should be a condition analogous to 
integral closure which plays a part in the study of general S-rings, but it is not clear 
what form this condition should take, or indeed, what its precise role would be. 

For other studies of infinite factorizations see [1, 3, 25, 27]. 


7. Special cases in the non-commutative theory. In a commutative UFD the 
principal ideals form a lattice; more generally this is so (by definition) in a commu- 
tative integral domain with highest common factors (HCF) and least common 
multiples (LCM). These are the HCF-rings we met in section 6; in fact it is enough 
to assume the existence of a HCF for each pair of elements, or equivalently assume 
the existence of a LCM for each pair [12]. Curiously enough, this symmetry dis- 
appears when we consider individual pairs: in an integral domain, any pair of ele- 
ments having an LCM also has an HCF, but the converse need not hold (consider 
the elements 2 and 2x in the subring of the polynomial ring Z[X] consisting of all 
polynomials with even coefficient of X). 

If R is an HCF-ring and K its field of fractions, then the principal fractional 
ideals in K form a group under multiplication, and this group structure is compatible 
with the ordering by inclusion. Thus we have a lattice-ordered group; it is well 
known that such a group is distributive as a lattice ([4], p. 292). In particular, in 
a Bezout domain R, for any $c \in R^*$ the set L(cR, R) of principal ideals between
R and cR is a distributive lattice. For non-commutative rings this need not be so,
even in the case of 2-firs, where L(cR, R) is a sublattice of the lattice of all right
ideals. Let us say that an integral domain R has a distributive factor lattice if for
each $c \in R^*$ the set L(cR, R) is a distributive sublattice of the lattice of all right
ideals of R. By the factorial duality the notion so defined is left-right symmetric; 
moreover, any ring with a distributive factor lattice is a 2-fir. 

A principal ideal domain has a distributive factor lattice if and only if every 
(left or right) ideal is two-sided [17] and this is a fairly stringent requirement. It 




is therefore of interest that among general 2-firs quite a wide class of rings have a 
distributive factor lattice, e.g., free associative algebras and group algebras of free 
groups. This follows from some technical results of G. M. Bergman proved in [3,
17]. These results show more generally that a 2-fir defined over a field k, which
remains a 2-fir under all field extensions, has a distributive factor lattice. 

In an atomic 2-fir with distributive factor lattice, the factors of a given element 
form a distributive lattice of finite length. These lattices have been described in 
terms of partially ordered sets [3, 4, 17]; to be precise, the categories of finite 
distributive lattices and homomorphisms, and finite partially ordered sets and order- 
preserving mappings are dual to each other via the functor Hom(—, 2), where 2 
is the 2-element lattice resp. ordered set. This description has been used by Bergman 
to study the possible factorizations that can occur. For example, in a commutative 
UFD, the only distributive lattices which can be realized in this way are direct 
products of chains; the corresponding partially ordered sets are disjoint unions of 
chains. However, in a free associative algebra, every finite distributive lattice can 
be realized as a lattice of factors in this way. The simplest case not occurring in
commutative rings is the partially ordered set [diagram], with the corresponding lattice
[diagram]. It is the factor lattice of the element x(x + 1)y in the free algebra on x and y
(cf. [3, 17]).

We can specialize 2-firs still further by requiring the set L(cR, R) of principal 
right ideals to be a chain. Any element c with this property is said to be rigid, and 
an integral domain in which all non-zero elements are rigid is called a rigid domain. 
A commutative domain is rigid precisely if it is a valuation ring (by definition of 
the latter), and a rigid commutative UFD is a discrete valuation ring. 

Among non-commutative rings a typical example is the ring of formal power 
series in several non-commuting indeterminates over a field: $k\langle x_1, \dots, x_n \rangle$ [17].
More generally, an integral domain is rigid if and only if it is a 2-fir and a local ring. 
An element c in an atomic 2-fir is rigid whenever all the factors of c in a complete 
factorization generate a proper ideal [29, 17]. 

A right discrete valuation ring may be defined as an integral domain R with 
an atom p such that every non-zero right ideal has the form $p^n R$ and $\bigcap p^n R = 0$.
Then a rigid UFD is a right discrete valuation ring if and only if it contains a non- 
unit c such that cR meets every non-zero right ideal of R non-trivially [17]. 


8. Remarks on the definition of non-commutative UFD. In some respects the 
definition of non-commutative UFD given in section 5 is not entirely satisfactory:
there is no analogue to Nagata’s theorem (Th. 4). Any reasonable analogue should
enable one to prove that the free algebra $\mathbf{Z}\langle x_1, \dots, x_n \rangle$ over the integers is a UFD,
but this is not the case according to the above definition, as the factorizations


$$xyx + 2x = x(yx + 2) = (xy + 2)x$$


show. They are complete factorizations of xyx + 2x, but xy +2 is not similar 
to yx +2. 




In order to describe the various possibilities that can arise, let us assume that 
we have an equivalence relation q defined on the set of non-zerodivisors in each ring 
such that 


E. 1. If $a \, q \, a'$ and a is an atom, then so is $a'$.
E. 2. In a commutative integral domain, $a \, q \, a'$ if and only if a is associated
to $a'$.


We shall call R a q-UFD if every element not zero or a unit has a complete
factorization into atoms, and given any two such factorizations of the same
element,


$$c = a_1 \cdots a_r = b_1 \cdots b_s,$$


we have $r = s$ and there exists a permutation $i \mapsto i'$ of $\{1, \dots, r\}$ such that $a_i \, q \, b_{i'}$.
For example, the class of UFD’s defined in section 5 may now be described more
accurately as similarity-UFD’s. As we remarked earlier, $\mathbf{Z}\langle x, y \rangle$ is not a
similarity-UFD, and we therefore try to find a wider equivalence q than similarity,
for which this ring is a q-UFD. A number of different choices for q have been proposed;
all are wider than similarity and in addition to E. 1-2 satisfy


E. 3. In an atomic 2-fir, q reduces to similarity. 


Their usefulness depends on the ease with which q can be checked; apart from 
this, the main requirement is that the property of being a q-UFD should be reflected
by taking rings of fractions (i.e., Nagata’s theorem). To describe this property we 
must first define primes in non-commutative rings. 

An element c of a ring R is said to be invariant if c is a non-zerodivisor such 
that cR = Rc. E.g., in a commutative integral domain every non-zero element is 
invariant. In general the condition on c just states that the left multiples of c are 
the same as the right multiples; for an invariant element c we can therefore write 
$c \mid b$ without ambiguity to indicate that b is a multiple of c. Now a prime is defined
as an invariant non-unit p in R, such that


$$p \mid ab \ \text{ implies } \ p \mid a \ \text{ or } \ p \mid b.$$


Clearly any product of invariant elements is invariant; thus if S is a multiplicative 
set consisting of prime products, then every $s \in S$ is invariant and hence, for each
$a \in R$, there exists $a' \in R$ satisfying $as = sa'$. This shows that the pair R, S satisfies
the Ore right multiple condition, and so S is a right denominator set [15], which
can be used to form the ring of fractions $R_S$. We note that since S consists of
non-zerodivisors, the natural homomorphism $\lambda: R \to R_S$ is injective, so we may take
R to be embedded as subring in $R_S$.

Suppose now that we have an equivalence q defined on each ring R, satisfying 
E. 1-3 and moreover, 




E. 4. Let R be a ring and S a multiplicative subset consisting of prime products.
If $a, a' \in R^*$ are such that $a \, q \, a'$ in $R_S$, then $a \, q \, a'$ in R.

With the help of this condition it is possible to establish an analogue of Nagata’s 
theorem. Thus let R be an atomic integral domain and S a multiplicative set consisting
of prime products, such that $R_S$ is a q-UFD, where q is an equivalence relation
satisfying E. 1-4. Then R is also a q-UFD; for given two atomic factorizations


$$c = a_1 \cdots a_r = b_1 \cdots b_s, \tag{10}$$


if one of $a_1, \dots, a_r, b_1, \dots, b_s$ is a prime, it is a left factor of c and so may be taken
to be $a_1$, say. Thus $a_1 \mid b_1 \cdots b_s$ and by primeness, $a_1 \mid b_i$ for some i; since $b_i$ is an
atom, it must be associated to $a_1$ and so $b_i$ is also prime. But then we can divide (10)
by $a_1$ and use induction on $r + s$ to complete the proof that every element has a
unique factorization into atoms. We may therefore assume that no $a_i$ or $b_j$ is prime;
then it follows that the $a_i$ and $b_j$ cannot divide any element of S. Going over to
$R_S$ we find that each $a_i$, $b_j$ is still an atom in $R_S$. Since $R_S$ is a q-UFD, $r = s$ and
there is a permutation $i \mapsto i'$ of $1, \dots, r$ such that $a_i \, q \, b_{i'}$ in $R_S$, and by E. 4, $a_i \, q \, b_{i'}$
in R. Thus we have proved:


THEOREM 7. Let q be an equivalence on rings satisfying E. 1-4. If R is an atomic
integral domain and S a multiplicative set consisting of prime products such
that $R_S$ is a q-UFD, then R is a q-UFD.


What was said earlier shows that similarity is an equivalence relation satisfying 
E. 1-3 but not E. 4. Since we are looking for an equivalence q wider than similarity, 
two elements a, b defining isomorphic modules R/aR, R/bR will lie in the same 
q-class, so that q can be described in terms of the category $\mathcal{C}_R$ of strictly cyclic
modules. Brungs in [8] defines a preordering ‘<’ on R by putting a < b whenever
there is an injective homomorphism $R/aR \to R/bR$. The associated equivalence:
‘a < b and b < a’ is called (right) subsimilarity. It satisfies E. 1-3 and in place 
of E. 4 satisfies an analogous condition (with a rather more complicated definition 
of prime), which enables one to prove, e.g., that any free algebra over Z is a sub- 
similarity-UFD. 

A still wider notion of equivalence was introduced in [14]: We again define a
preordering by putting a before b whenever there is a monomorphism $R/aR \to R/bR$
(i.e., a right cancellative map in $\mathcal{C}_R$); the associated equivalence is called right
monosimilarity. This satisfies E. 1-3 and also E. 4 if we limit ourselves to multi- 
plicative sets S in the centre of the ring. In this way we reach the notion of a right 
monosimilarity-UFD; dually one can define right episimilarity-UFD’s, on replacing 
mono- by epimorphisms, but by the factorial duality this is the same as a left mono- 
similarity-UFD [14]. 

Still wider notions of equivalence are possible. Let us call two elements a, b of 
a ring R left coprime if they have no common left factor apart from units, and define 
right coprime similarly. A relation 


$$ab' = ba' \tag{11}$$


is said to be coprime if $a$, $b$ are left coprime and $a'$, $b'$ right coprime. It can be shown
([17], p. 126) that in a 2-fir two elements $a$, $a'$ are similar if and only if they can be put
in a coprime relation (11). This shows that the relation between $a$ and $a'$, expressed
by a coprime equation (11) in a 2-fir, is an equivalence. In general rings this is not
so, but we can construct an equivalence as follows. Let us say that $a, a'$ (in that
order) are perspective if they can be put in a coprime relation (11); two elements
$a, a'$ will be called projective if there is a chain $a_0 = a, a_1, \dots, a_n = a'$ such that
for $i = 1, \dots, n$ either $a_{i-1}, a_i$ or $a_i, a_{i-1}$ are perspective. Then projectivity is an
equivalence, in fact it is the equivalence ‘generated’ by perspectivity, and by what
has been said, it reduces to similarity in 2-firs. This relation has been studied by
Beauregard [2] for certain classes of rings. Let us define a weak HCF-ring as an
integral domain R such that for each $c \in R^*$ the set L(cR, R) is a modular lattice,
relative to the ordering by inclusion. By the factorial duality this notion is left-right 
symmetric, and in the commutative case, it reduces to the notion of HCF-ring 
considered in section 6. Beauregard [2] shows that every atomic weak HCF-ring 
is a projectivity-UFD. Now it is not hard to verify that projectivity is an equivalence 
satisfying E. 1-4, therefore Th. 7 applies in this case. 

To give an example of a ring which definitely falls outside all these definitions 
of UFD, let us take the Weyl algebra, i.e., the ring A generated by elements x, y 
with the defining relation xy — yx = 1 over a field k of characteristic not 2. This 
ring is a Noetherian domain and hence has a skew field of fractions. Let S be the 
set of all non-zero polynomials in y, then S is a right denominator set and the ring 
$A_S$ of fractions is a principal ideal domain and hence a (similarity-) UFD. However,
we have the following factorizations in A:


$$xyx + x = (xy + 1)x = x^2 y.$$


It is easily checked that x, y,xy + 1 are atoms, so not even the number of factors 
in a complete factorization is constant. 
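
The equality of the three expressions above can be checked mechanically. The following Python sketch is not part of the original article; it assumes a hypothetical representation of elements of A as dictionaries mapping words in the letters x, y to integer coefficients, and it reduces every word to a normal form (all x's before all y's) by repeatedly applying the rule yx = xy - 1, which is the defining relation xy - yx = 1 rearranged. The function names `normal_form`, `add` and `mul` are illustrative only. The sketch confirms the identity; it does not test whether the factors are atoms.

```python
from collections import defaultdict

def normal_form(poly):
    """Rewrite each word so that no 'y' stands directly before an 'x', using yx = xy - 1."""
    result = defaultdict(int)
    stack = list(poly.items())
    while stack:
        word, coeff = stack.pop()
        i = word.find("yx")
        if i < 0:
            result[word] += coeff          # word already has the shape x...xy...y
        else:
            # Replace the factor yx by xy - 1 (from the relation xy - yx = 1).
            stack.append((word[:i] + "xy" + word[i + 2:], coeff))
            stack.append((word[:i] + word[i + 2:], -coeff))
    return {w: c for w, c in result.items() if c != 0}

def add(p, q):
    """Sum of two elements, returned in normal form."""
    total = defaultdict(int)
    for poly in (p, q):
        for w, c in poly.items():
            total[w] += c
    return normal_form(total)

def mul(p, q):
    """Product of two elements: concatenate words, then reduce to normal form."""
    prod = defaultdict(int)
    for u, a in p.items():
        for v, b in q.items():
            prod[u + v] += a * b
    return normal_form(prod)

x, y, one = {"x": 1}, {"y": 1}, {"": 1}

lhs = add(mul(mul(x, y), x), x)          # xyx + x
middle = mul(add(mul(x, y), one), x)     # (xy + 1)x
rhs = mul(mul(x, x), y)                  # x^2 y
print(lhs, middle, rhs)
```

All three values reduce to the same normal form, the single word `xxy` with coefficient 1. Integer coefficients suffice here because the identity involves no division.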


9. Factorizing zerodivisors. So far we have confined ourselves almost entirely to 
integral domains. A very similar theory is possible for the factorization of non-zero- 
divisors in general rings, but this is of less interest and so has not received as much at- 
tention. For ‘full’ matrices over firs there is a fairly satisfactory theory (cf. [17], ch. 5, 
and, for an application, [16]). The corresponding theory for rectangular matrices still 
faces difficulties, in that not all complete factorizations of a given matrix have the 
same number of factors [17]. 

Finally there is the problem of factorizing zero-divisors. A definition of commu- 
tative unique factorization ring (with zero-divisors) has been given by Fletcher 
[21, 22], who shows that the unique factorization rings so defined are just the finite 
direct products of UFD’s and ‘special’ principal ideal rings (i.e., homomorphic 
images of discrete valuation rings). The main difficulty in factorizing zero-divisors 




is that one cannot expect uniqueness; nevertheless legitimate questions can be asked, 
as is shown by the theorem on the diagonal reduction of matrices over a principal 
ideal domain, which may be regarded as the prototype of a unique factorization 
theorem for this case. 


References 


1. R. A. Beauregard, Infinite primes and unique factorization in a principal right ideal domain, 
Trans. Amer. Math. Soc., 141 (1969) 245-254. 

2. R. A. Beauregard, Right LCM-domains, Proc. Amer. Math. Soc., 30 (1971) 1-7.

3. G. M. Bergman, Commuting elements in free algebras and related topics in ring theory, 
Thesis, Harvard University, 1967. 

4. G. Birkhoff, Lattice theory, 3rd ed. (AMS, Providence 1967).

5. L. A. Bokut’, Factorization theorems for certain classes of rings without zero-divisors (Rus- 
sian) I, Algebra i Logika 4, No. 4 (1965) 25-52; II. ibid. No. 5 (1965) 17-46. 

6. N. Bourbaki, Algébre commutative, ch. 7 (Hermann, Paris 1965). 

7. A. J. Bowtell, On a question of Malcev, J. Algebra, 6 (1967) 126-139. 

8. H. H. Brungs, Ringe mit eindeutiger Faktorzerlegung, J. Reine Angew. Math., 236 (1969) 
43-66. 

9. P. M. Cohn, Noncommutative unique factorization domains, Trans. Amer. Math. Soc., 109 
(1963) 313-331; correction 119 (1965) 552. 


10. P. M. Cohn, Universal algebra, Harper & Row, New York, London, Tokyo, 1965.

11. P. M. Cohn, Some remarks on the invariant basis property, Topology, 5 (1966) 215-228.

12. P. M. Cohn, Bezout rings and their subrings, Proc. Cambridge Phil. Soc., 64 (1968) 251-264.

13. P. M. Cohn, Free associative algebras, Bull. London Math. Soc., 1 (1969) 1-39.

14. P. M. Cohn, Factorization in general rings and strictly cyclic modules, J. Reine Angew. Math.,
239/40 (1970) 185-200.

15. P. M. Cohn, Rings of fractions, this MONTHLY, 78 (1971) 596-615.

16. P. M. Cohn, The embedding of firs in skew fields, Proc. London Math. Soc., (3) 23 (1971) 193-213.

17. P. M. Cohn, Free rings and their relations, Academic Press, London & New York, 1971.


18. R. Dedekind, Über die Theorie der ganzen algebraischen Zahlen, Vieweg, Braunschweig, 1964.

19. Euclid, Elements (—300). 

20. H. Fitting, Über den Zusammenhang zwischen dem Begriff der Gleichartigkeit zweier Ideale
und dem Äquivalenzbegriff der Elementarteilertheorie, Math. Ann., 122 (1936) 572-582.

21. C.R. Fletcher, Unique factorization rings, Proc. Cambridge Phil. Soc., 65 (1969) 579-583. 

22. C. R. Fletcher, The structure of unique factorization rings, Proc. Cambridge Phil. Soc., 67 (1970)
535-540. 

23. A. Grothendieck, Eléments de géométrie algébrique (PUF, Paris 1960). 

24. N. Jacobson, Theory of rings, AMS, Providence, 1943. 

25. A. V. Jategaonkar, A counter-example in homological algebra and ring theory, J. Algebra, 12 
(1969) 418-440. 

26. R. E. Johnson, Unique factorization in a principal right ideal domain, Proc. Amer. Math.
Soc., 16 (1965) 526-528. 

27. R. E. Johnson, Unique factorization monoids and domains, Proc. Amer. Math. Soc., 28 (1971)
397-404. 

28. I. Kaplansky, Commutative rings, Allyn & Bacon, Boston, 1970. 

29. E. G. Koševoi, On the multiplicative semigroup of a class of rings without zero-divisors
(Russian), Algebra i Logika 5, No. 5 (1966) 49-54. 

30. W. Krull, Idealtheorie, Ergeb. d. Math. vol. 4, 3, Springer, Berlin, 1935. 




31. E. Landau, Ein Satz über die Zerlegung homogener linearer Differentialausdrücke in irreduzible
Faktoren, J. Reine Angew. Math., 124 (1902) 115-120.

32. S. Lang, Introduction to algebraic geometry, Interscience, New York, 1958. 

33. A. Loewy, Über reduzible homogene Differentialausdrücke, Math. Ann., 56 (1903) 549-584.

34. M. Nagata, A remark on the unique factorization theorem, J. Math. Soc. Japan, 9 (1957) 
143-145. 

35. M. Nagata, Local rings, Interscience, New York, London, 1962.

36. P. Samuel, Sur les anneaux factoriels, Bull. Soc. Math. France, 89 (1961) 155-178. 


37. P. Samuel, Anneaux factoriels, São Paulo, 1963.
38. P. Samuel, Lectures on unique factorization domains, TIFR, Bombay, 1964.
39. P. Samuel, Unique factorization, this MONTHLY, 75 (1968) 945-952.


40. S. Stevin, Arithmétique (1585, new ed. 1958). 
41. B. L. van der Waerden, Moderne algebraische Geometrie, Springer, Berlin, 1939. 
42. O. Zariski and P. Samuel, Commutative algebra I, Van Nostrand, Princeton, 1968. 


CONTINUOUS ANALOGUES OF SERIES 


R. P. BOAS, Jr., Northwestern University and 
H. POLLARD, Purdue University 


1. Introduction. The present note, which was inspired by [10], arose as an 
attempt to understand why some infinite series have continuous analogues whereas 
others do not. The methods of [10] are ad hoc and fail to reveal the underlying 
mechanism. 

The notion of continuous analogue is not easy to define precisely; however, 
given that [1], [6], [7] 


$$\sum_{n=-\infty}^{\infty} \frac{\sin^2 (c+n)\alpha}{(c+n)^2} = \int_{-\infty}^{\infty} \frac{\sin^2 (c+x)\alpha}{(c+x)^2}\,dx = \pi\alpha, \qquad 0 < \alpha < \pi, \tag{1}$$


where $(\sin^2 \alpha u)/u^2$ is taken as $\alpha^2$ when $u = 0$, almost anyone would call the integral
in (1) a continuous analogue of the series. A less transparent example is (series: see,
for example, [5], p. 102; integral: see any book on complex analysis)


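$$\sum_{n=-\infty}^{\infty} \frac{\sin (n - \tfrac12)\alpha}{n - \tfrac12} = \int_{-\infty}^{\infty} \frac{\sin \alpha x}{x}\,dx = \pi \operatorname{sgn} \alpha, \qquad |\alpha| < 2\pi, \tag{2}$$

where the analogy seems flawed; but it is improved if we rewrite (2) as

$$\sum_{n=-\infty}^{\infty} \frac{\sin (n - \tfrac12)\alpha}{n - \tfrac12} = \int_{-\infty}^{\infty} \frac{\sin (x - \tfrac12)\alpha}{x - \tfrac12}\,dx. \tag{3}$$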

Recently Pollard and Shisha [10] observed that although the binomial series 




$$(1 + e^{it})^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} e^{int}, \qquad -\pi < t < \pi, \ \alpha > -1, \tag{4}$$


does not have a continuous analogue (that is, we do not get a correct result by
replacing n in (4) by x and replacing summation by integration), it does if we
first extend the sum in (4) over $(-\infty, \infty)$. This does not change the series because
the added terms are all zero, and it turns out that, in fact,


$$\int_{-\infty}^{\infty} \binom{\alpha}{x} e^{ixt}\,dx = \sum_{n=-\infty}^{\infty} \binom{\alpha}{n} e^{int} = (1 + e^{it})^{\alpha} \tag{5}$$


for $\alpha > -1$ and $-\pi < t < \pi$.

Thus a series that does not have a continuous analogue may acquire one if we 
write a different but equivalent formula for the terms of the series. 

As another example, it originally puzzled us that the binomial series (4) has a 
continuous analogue whereas the equally natural 


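$$(1 - e^{it})^{-\alpha} = \sum_{n=0}^{\infty} \binom{\alpha + n - 1}{n} e^{int}, \qquad \alpha < 1, \ 0 < t < 2\pi, \tag{6}$$


does not, even if we extend the sum over $(-\infty, \infty)$.
If, however, we write (6) in the equivalent form


$$(1 - e^{it})^{-\alpha} = \sum_{n=0}^{\infty} \binom{\alpha + n - 1}{n} \frac{\sin \pi(n + \alpha)}{\sin \pi\alpha}\, e^{-i\pi n} e^{int}, \qquad 0 < t < 2\pi, \tag{7}$$

it is true (as we shall see) that also

$$(1 - e^{it})^{-\alpha} = \int_{-\infty}^{\infty} \binom{\alpha + u - 1}{u} \frac{\sin \pi(u + \alpha)}{\sin \pi\alpha}\, e^{-i\pi u} e^{iut}\,du. \tag{8}$$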


Continuous analogues of series are of interest in physics (cf. [2]), where one often 
attempts to deal with an intractable sum by replacing it by the corresponding integral. 
In fact, the sum in (1) does arise in physics and was ‘‘approximated’’ by the integral 
before it was realized that the approximation is exact. (See [1], [6], [7].) 


2. A general formula. We shall need the notation, but only the simplest 
theorems, of the theory of Fourier transforms. (Everything that we use is in [11].) 
Our notation is 


$$g(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-ixu} G(u)\,du, \qquad G(u) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} g(x) e^{iux}\,dx, \tag{9}$$

and we shall use other letters in the same way. In all our work, G will be zero outside
a certain finite interval, and we shall suppose that $|G|$ is integrable.

Suppose now that $G(x) = 0$ for x outside an interval $(t - 2\pi, t + 2\pi)$, with
$G(x) \to 0$ as $x \to t \pm 2\pi$ from inside the interval. Then when n is an integer,


$$(2\pi)^{1/2} g(n) = \int_{t-2\pi}^{t} e^{-inu} G(u)\,du + \int_{t}^{t+2\pi} e^{-inu} G(u)\,du = \int_{t}^{t+2\pi} e^{-inu} \{G(u - 2\pi) + G(u)\}\,du.$$


The numbers on the right are $2\pi$ times the Fourier coefficients for the interval
$(t, t + 2\pi)$ of the function $G(u - 2\pi) + G(u)$. Let us suppose that $G(u - 2\pi) + G(u)$
satisfies, at t, a sufficient condition for the convergence of its Fourier series to its
value at t (for example, it is enough to have the function of bounded variation in a
neighborhood of t, its value at t being the average of its right-hand and left-hand
limits). Then $\sum g(n) e^{int}$ is a Fourier series and converges to the value at t of the
function that generates it; that is,


$$\sum_{n=-\infty}^{\infty} g(n) e^{int} = (2\pi)^{1/2} \{G(t - 2\pi) + G(t)\} = (2\pi)^{1/2} G(t).$$


If we replace G(t) by its value from (9), we find


$$\sum_{n=-\infty}^{\infty} g(n) e^{int} = \int_{-\infty}^{\infty} g(x) e^{ixt}\,dx. \tag{10}$$

In practice we usually have $G(x) = 0$ outside a shorter interval $(r, s)$; then (10)
holds for $s - 2\pi < t < r + 2\pi$. In particular, it holds for $0 < t < 2\pi$ when
$(r, s) = (0, 2\pi)$; and for $-\pi < t < \pi$ when $(r, s) = (-\pi, \pi)$.

Formula (10) is actually a special case of the Poisson summation formula ([11], 
p. 60; [13], vol. I, p. 68; 2; [8], p. 152), but we do not need the general formula. 


3. Examples. We can now produce examples of (10) by looking at functions 
that are known to have the form 


$$g(x) = (2\pi)^{-1/2} \int_{r}^{s} e^{-ixu} G(u)\,du, \qquad s - r < 4\pi \tag{11}$$


(cf. [1], [4], [6], [7]).


Let us first take $G(u) = e^{-icu}$ on $(-\alpha, \alpha)$ and $G(u) = 0$ for $|u| > \alpha$, where
$0 < \alpha < \pi$. Then


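$$g(x) = (2\pi)^{-1/2} \int_{-\alpha}^{\alpha} e^{-icu - ixu}\,du = 2 (2\pi)^{-1/2}\, \frac{\sin (c + x)\alpha}{c + x}, \tag{12}$$


with the convention that $u^{-1} \sin \alpha u = \alpha$ when $u = 0$. Hence we have (10) for
$\alpha - 2\pi < t < -\alpha + 2\pi$, that is


$$\sum_{n=-\infty}^{\infty} \frac{\sin (c + n)\alpha}{c + n}\, e^{int} = \int_{-\infty}^{\infty} \frac{\sin (c + x)\alpha}{c + x}\, e^{ixt}\,dx,$$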


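provided that $0 < \alpha < \pi$, $-\pi < t < \pi$. In particular, we can take $t = 0$ and then

$$\sum_{n=-\infty}^{\infty} \frac{\sin (c + n)\alpha}{c + n} = \int_{-\infty}^{\infty} \frac{\sin (c + x)\alpha}{c + x}\,dx = \pi;$$

since the integrand is an odd function the formula can be written in the form

$$\sum_{n=-\infty}^{\infty} \frac{\sin (c + n)\alpha}{c + n} = \int_{-\infty}^{\infty} \frac{\sin (c + x)\alpha}{c + x}\,dx = \pi \operatorname{sgn} \alpha, \qquad |\alpha| < \pi.$$

For $c = \tfrac12$ we have (2) and for $c = 0$ we have

$$\sum_{n=-\infty}^{\infty} \frac{\sin n\alpha}{n} = \int_{-\infty}^{\infty} \frac{\sin \alpha x}{x}\,dx = \pi \operatorname{sgn} \alpha, \qquad |\alpha| < \pi;$$

remembering that the term with $n = 0$ is to be interpreted as $\alpha$, we have a symmetrical
version of a familiar Fourier expansion.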

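This discussion can be generalized. The product of two Fourier transforms $g_1$
and $g_2$ is again a Fourier transform; if $G_1$ and $G_2$ vanish outside $(-\alpha, \alpha)$ and
$(-\beta, \beta)$, respectively, then $g_1 g_2$ is the transform of a function vanishing outside
$(-\alpha - \beta, \alpha + \beta)$ (actually the function for which $g_1 g_2$ is the transform is the
convolution of $G_1$ and $G_2$, but we do not need to know this). Hence if


$$g_1(x) = (2\pi)^{-1/2} \int_{-\alpha}^{\alpha} e^{-ixu} G_1(u)\,du \quad \text{and} \quad g_2(x) = (2\pi)^{-1/2} \int_{-\beta}^{\beta} e^{-ixu} G_2(u)\,du,$$


and $\alpha + \beta < 2\pi$, then


$$\sum_{n=-\infty}^{\infty} g_1(n) g_2(n) e^{int} = \int_{-\infty}^{\infty} g_1(x) g_2(x) e^{ixt}\,dx$$


provided that $\alpha + \beta - 2\pi < t < -\alpha - \beta + 2\pi$.
Taking $g_1$ as in (12) and $t = 0$, we have in particular

$$\sum_{n=-\infty}^{\infty} \frac{\sin (c + n)\alpha}{c + n}\, g_2(n) = \int_{-\infty}^{\infty} \frac{\sin (c + x)\alpha}{c + x}\, g_2(x)\,dx. \tag{13}$$

If we further specialize by taking $g_2$ equal to $g_1$, we get

$$\sum_{n=-\infty}^{\infty} \frac{\sin^2 (c + n)\alpha}{(c + n)^2} = \int_{-\infty}^{\infty} \frac{\sin^2 (c + x)\alpha}{(c + x)^2}\,dx = \pi\alpha, \qquad 0 < \alpha < \pi; \tag{14}$$


this is formula (1) of §1.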
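
As a numerical illustration (not part of the original note), the series side of this last formula can be summed directly and compared with the value $\pi\alpha$. The Python sketch below assumes a symmetric truncation at $|n| \le N$; the function name `series_sum` and the sample values of $c$, $\alpha$ and $N$ are arbitrary choices made for the illustration. Since each omitted term is at most $1/(c+n)^2$, the truncation error is roughly $2/N$.

```python
import math

def series_sum(c, alpha, N):
    """Symmetric partial sum of sin^2((c+n)*alpha)/(c+n)^2 over n = -N..N."""
    total = 0.0
    for n in range(-N, N + 1):
        u = c + n
        if u == 0.0:
            # Convention from the text: sin^2(alpha*u)/u^2 is taken as alpha^2 at u = 0.
            total += alpha * alpha
        else:
            total += (math.sin(u * alpha) / u) ** 2
    return total

c, alpha, N = 0.3, 1.2, 200_000      # sample values with 0 < alpha < pi
approx = series_sum(c, alpha, N)
exact = math.pi * alpha
print(approx, exact, abs(approx - exact))
```

The printed difference should be of the order of the truncation error; repeating the run with another $c$ leaves the limit unchanged, which is the point of the formula.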


4. Binomial series. With Pollard and Shisha, we start from the formula 


$$\binom{\alpha}{x} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ixt} (1 + e^{it})^{\alpha}\,dt, \qquad \alpha > -1, \ -\infty < x < \infty. \tag{15}$$


This is of the form (11) with $r = -\pi$, $s = \pi$; the Fourier series of $(1 + e^{it})^{\alpha}$ converges
for $|t| < \pi$ by almost any convergence test that we might think of applying.




Then (10) takes the form (5). 
Pollard and Shisha also give 


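$$\sum_{n=-\infty}^{\infty} \binom{\alpha}{n + c} e^{i(n+c)t} = \int_{-\infty}^{\infty} \binom{\alpha}{u + c} e^{i(u+c)t}\,du = (1 + e^{it})^{\alpha} \tag{16}$$


under the same conditions as (5); this is (10) again with

$$G(t) = \begin{cases} e^{-ict} (1 + e^{it})^{\alpha}, & |t| < \pi; \\ 0, & |t| > \pi. \end{cases}$$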

The pair (7), (8) follow in the same way, once we realize that (15) can be transformed into

$$e^{-i\pi x}\, \frac{\sin \pi(\alpha + x)}{\sin \pi\alpha} \binom{\alpha + x - 1}{x} = \frac{1}{2\pi} \int_{0}^{2\pi} (1 - e^{it})^{-\alpha} e^{-ixt}\,dt$$

by the usual formulas about the gamma function.


5. A general method. The preceding discussion suggests a general method 
for constructing continuous analogues of series. Consider a function f defined 
on the integers, and let $\sum_{n=-\infty}^{\infty} f(n) e^{int}$ be the Fourier series of an integrable function
F, so that


$$f(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-inu} F(u)\,du.$$


There are many conditions that are sufficient for this, for example that $\sum |f(n)|$
converges or that $f(n) \to 0$ and f is even and convex ([13], vol. I, pp. 183, 326).
The function $\phi$ defined for all real x by


$$\phi(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ixu} F(u)\,du$$

interpolates f at the integers and has the form (11). Consequently we have


$$\sum_{n=-\infty}^{\infty} f(n) e^{int} = \sum_{n=-\infty}^{\infty} \phi(n) e^{int} = \int_{-\infty}^{\infty} \phi(x) e^{ixt}\,dx, \qquad |t| < \pi.$$

That is, we can construct a continuous analogue of any Fourier series that belongs
to a function satisfying the conditions imposed on G in §2. Whether we are willing
to regard this as a reasonable analogue seems to depend on whether we can write
$\phi(x)$ in a sufficiently recognizable form.

Let us see how the method works out in some specific examples. 

We first look for a continuous analogue of the logarithmic series, which we can 
write in the form 


$$it = \sum_{n \neq 0} \frac{(-1)^{n-1}}{n}\, e^{int}, \qquad |t| < \pi.$$


Here


$$f(n) = \frac{(-1)^{n-1}}{n} = \frac{1}{2\pi} \int_{-\pi}^{\pi} iu\, e^{-inu}\,du, \quad n \neq 0; \qquad f(0) = 0;$$

$$F(u) = iu,$$

$$\phi(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ixu} F(u)\,du = -\frac{1}{x} \left( \cos \pi x - \frac{\sin \pi x}{\pi x} \right).$$


It is clear that $\phi(n)$ does in fact equal $f(n)$, although there is no really natural compelling
analogue of $(-1)^{n-1}/n$, and $\phi(x)$ is perhaps not the most obvious interpolating
function. Our continuous analogue is


$$\sum_{n \neq 0} \frac{(-1)^{n-1}}{n}\, e^{int} = -\int_{-\infty}^{\infty} \frac{1}{x} \left( \cos \pi x - \frac{\sin \pi x}{\pi x} \right) e^{ixt}\,dx = it, \qquad |t| < \pi.$$

This seems acceptable; indeed, $\phi$ is the only possible interpolating function of the
form (11).
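
The interpolation claim can be illustrated numerically. The Python sketch below is an illustration only, not part of the original note. It assumes the closed form $\phi(x) = -\frac{1}{x}\bigl(\cos \pi x - \frac{\sin \pi x}{\pi x}\bigr)$ with $\phi(0) = 0$, obtained by evaluating $\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ixu}\, iu\,du$, and it uses the fact that this integral reduces to $\frac{1}{\pi}\int_0^{\pi} u \sin(xu)\,du$ because the odd part of the integrand cancels. The names `phi`, `f` and `phi_quadrature` are hypothetical, and the midpoint rule is just one convenient quadrature.

```python
import math

def phi(x):
    """Closed form -(1/x) * (cos(pi x) - sin(pi x)/(pi x)); phi(0) = 0 by continuity."""
    if x == 0:
        return 0.0
    return -(math.cos(math.pi * x) - math.sin(math.pi * x) / (math.pi * x)) / x

def f(n):
    """Coefficients of the logarithmic series: (-1)^(n-1)/n for n != 0, and f(0) = 0."""
    if n == 0:
        return 0.0
    sign = 1 if (n - 1) % 2 == 0 else -1
    return sign / n

def phi_quadrature(x, steps=20000):
    """Midpoint-rule value of (1/pi) * integral over [0, pi] of u*sin(x*u) du."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += u * math.sin(x * u)
    return total * h / math.pi

# phi agrees with f at the integers ...
max_gap = max(abs(phi(n) - f(n)) for n in range(-50, 51))
# ... and with the defining integral at a non-integer point.
quad_gap = abs(phi(0.37) - phi_quadrature(0.37))
print(max_gap, quad_gap)
```

Both gaps should be at the level of rounding or quadrature error.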

Now let us look for a continuous analogue of the exponential series, 


$$\sum_{n=0}^{\infty} \frac{1}{n!}\, e^{int} = \exp(e^{it}).$$


Here


$$\phi(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ixu} \exp(e^{iu})\,du, \tag{17}$$


and we have $\phi(n) = 1/n!$, $n = 0, 1, 2, \dots$; $\phi(n) = 0$, otherwise;

$$\sum_{n=0}^{\infty} \frac{e^{int}}{n!} = \int_{-\infty}^{\infty} \phi(x) e^{ixt}\,dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ixt}\,dx \int_{-\pi}^{\pi} e^{-ixu} \exp(e^{iu})\,du.$$

This, although formally a pair of analogues, seems unsatisfactory, partly at least
because we are conditioned to expect a continuous analogue of $1/n!$ to involve
$1/\Gamma(x + 1)$, whereas $\phi(x) = 1/\Gamma(x + 1)$ only when $x = n$.

We note that a function $\phi(x)$ of the form (18) is easily seen to be (a) bounded
on the real axis, (b) of exponential type in the plane, i.e., $|\phi(z)| < A e^{\pi |z|}$. But
$1/\Gamma(x + 1)$ does not satisfy either (a) or (b), for example because
$1/\Gamma(-n + \tfrac12 + 1) = (-1)^{n-1} \Gamma(n - \tfrac12)/\pi \to \infty$ faster than any $e^{an}$ (by Stirling’s formula).

We should accordingly like to put $\phi(x)$ into a form that involves $1/\Gamma(x + 1)$
explicitly. Now a well-known formula (see [3], vol. 1, p. 13) states that


$$\frac{1}{\Gamma(z + 1)} = \frac{1}{2\pi i} \int_{-\infty}^{(0+)} e^{t}\, t^{-z-1}\,dt,$$


24 R. P. BOAS AND H. POLLARD [January 


where the path of integration can be taken to be the loop extending from — 00 to — 1 
along the real axis, around zero on the unit circumference, and back to — oo. This 
yields 


1 sinnx [° e°" 
P(X) = ri+x) 2 | ghee, 
and consequently with this (x) 
» < em = » d(nje” = | b(x)e™ dx. 


n=Q 
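The two expressions for the interpolating function can be compared numerically: the defining integral $\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ixu}\exp(e^{iu})\,du$ of (17) against $\frac{1}{\Gamma(1+x)} + \frac{\sin \pi x}{\pi}\int_1^{\infty} e^{-u}u^{-x-1}\,du$. The sketch below is illustrative only; the function names, the Simpson rule, and the truncation of the infinite integral at $u = 60$ are assumptions, not part of the original paper.

```python
import cmath
import math


def simpson(func, a: float, b: float, steps: int):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def phi_17(x: float) -> float:
    """(1/2pi) * integral over [-pi, pi] of e^{-ixu} exp(e^{iu}) du, by quadrature."""
    integrand = lambda u: cmath.exp(-1j * x * u) * cmath.exp(cmath.exp(1j * u))
    return (simpson(integrand, -math.pi, math.pi, 2000) / (2.0 * math.pi)).real


def reciprocal_gamma(z: float) -> float:
    """1/Gamma(z), taken to be 0 at the poles z = 0, -1, -2, ..."""
    if z <= 0 and z == int(z):
        return 0.0
    return 1.0 / math.gamma(z)


def phi_closed(x: float) -> float:
    """1/Gamma(1+x) + (sin(pi x)/pi) * integral from 1 to infinity of e^{-u} u^{-x-1} du."""
    # The upper limit 60 is an assumed cutoff; e^{-60} makes the neglected tail negligible here.
    tail = simpson(lambda u: math.exp(-u) * u ** (-x - 1.0), 1.0, 60.0, 20000)
    return reciprocal_gamma(1.0 + x) + math.sin(math.pi * x) / math.pi * tail


if __name__ == "__main__":
    for x in (0.5, 2.5, -1.5, 3.0, -2.0):
        print(x, phi_17(x), phi_closed(x))
```

At $x = 3$ both functions should return approximately $1/3!$, and at $x = -2$ approximately $0$, in line with $\phi(n) = 1/n!$ for $n \ge 0$ and $\phi(n) = 0$ otherwise.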


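5. Bessel functions. The generating function for the Bessel functions $J_n(s)$ is

$$\exp\left(\tfrac12 s(w - 1/w)\right) = \sum_{n=-\infty}^{\infty} w^{n} J_n(s),$$

or, with $w = e^{it}$,

$$\exp\left(\tfrac12 s(e^{it} - e^{-it})\right) = \sum_{n=-\infty}^{\infty} e^{int} J_n(s). \tag{18}$$

Let us look for a continuous analogue of (18). Bessel's integral ([12], p. 19) is

$$J_n(s) = \frac{1}{\pi}\int_{0}^{\pi} \cos(n\theta - s\sin\theta)\,d\theta,$$

when $n$ is an integer. Replacing $n$ by $x$ yields a function of the form (11); unfortunately
it is not $J_x(s)$ when $x$ is not an integer, but is known as Anger's function $\mathbf{J}_x(s)$
([12], p. 308). What is true is that ([12], p. 176)

$$J_x(s) + h_x(s) = \frac{1}{\pi}\int_{0}^{\pi} \cos(x\theta - s\sin\theta)\,d\theta,$$

where

$$h_x(s) = \frac{\sin \pi x}{\pi}\int_{0}^{\infty} e^{-x\tau - s\sinh\tau}\,d\tau = \mathbf{J}_x(s) - J_x(s),$$

and $h_x(s) = 0$ when $x$ is an integer. Hence an analogue of (18) is

$$\sum_{n=-\infty}^{\infty} e^{int}\bigl(J_n(s) + h_n(s)\bigr) = \int_{-\infty}^{\infty} e^{ixt}\bigl(J_x(s) + h_x(s)\bigr)\,dx,$$

or alternatively

$$\sum_{n=-\infty}^{\infty} e^{int}J_n(s) = \sum_{n=-\infty}^{\infty} e^{int}\mathbf{J}_n(s) = \int_{-\infty}^{\infty} e^{ixt}\mathbf{J}_x(s)\,dx,$$

which is no less “natural” than the pair (7), (8).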
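The relation between Anger's function and $J_x(s)$ can be illustrated numerically. In the sketch below (an illustration under stated assumptions, not the authors' computation) Anger's function is evaluated from the integral $\frac{1}{\pi}\int_0^{\pi}\cos(x\theta - s\sin\theta)\,d\theta$, the correction $h_x(s)$ from $\frac{\sin \pi x}{\pi}\int_0^{\infty} e^{-x\tau - s\sinh\tau}\,d\tau$ truncated at $\tau = 12$, and $J_x(s)$ from its standard power series for $s > 0$. All function names are hypothetical.

```python
import math


def simpson(func, a: float, b: float, steps: int) -> float:
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def anger(x: float, s: float) -> float:
    """Anger's function: (1/pi) * integral over [0, pi] of cos(x*theta - s*sin(theta))."""
    return simpson(lambda th: math.cos(x * th - s * math.sin(th)), 0.0, math.pi, 2000) / math.pi


def h(x: float, s: float) -> float:
    """(sin(pi x)/pi) * integral over [0, inf) of exp(-x*tau - s*sinh(tau)), for s > 0."""
    # Assumed cutoff: beyond tau = 12 the integrand underflows to zero for s of order 1.
    tail = simpson(lambda tau: math.exp(-x * tau - s * math.sinh(tau)), 0.0, 12.0, 12000)
    return math.sin(math.pi * x) / math.pi * tail


def bessel_j(x: float, s: float, terms: int = 40) -> float:
    """Power series for J_x(s), valid for s > 0 and x not a negative integer."""
    return sum(
        (-1) ** k * (s / 2.0) ** (2 * k + x) / (math.factorial(k) * math.gamma(k + x + 1.0))
        for k in range(terms)
    )


if __name__ == "__main__":
    s = 1.0
    for x in (0, 1, 2, 0.5):
        print(x, anger(x, s), h(x, s), bessel_j(x, s))
```

For integer $x$ the second column should vanish to rounding error and the first and third should coincide; for $x = 1/2$ the difference of the first two columns should reproduce the third, which can also be checked against $J_{1/2}(s) = \sqrt{2/(\pi s)}\,\sin s$.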


1973] ENGLAND WAS LOST ON THE PLAYING FIELDS OF ETON 25 


Added in proof: For similar results see [14]. 


References 


1. A. B. Bhatia and K.S. Krishnan, Light-scattering in homogeneous media regarded as 
reflexion from appropriate thermal elastic waves, Proc. Roy. Soc. London, Ser. A. 192 (1948) 181-194. 

2. R. P. Boas, Jr. and C. Stutz, Estimating sums with integrals, Amer. J. Physics, 39 (1971) 
745-753. 

3. A. Erdélyi, et al., Higher transcendental functions, McGraw-Hill, New York, etc., 1953. 

4. D. Jagerman, Bounds for truncation error of the sampling expansion, SIAM J. Appl. Math., 
14 (1966) 714-723. 

5. L. B. W. Jolley, Summation of series, 2d ed., Dover, New York, 1961. 

6. K.S. Krishnan, A simple result in quadrature, Nature, 162 (1948) 215. 

7. K. S. Krishnan, On the equivalence of certain infinite series and the corresponding integrals, J. Indian
Math. Soc., (N.S.) 12, (1948) 79-88.

8. L.H. Loomis, An introduction to abstract harmonic analysis, Van Nostrand, New York, 
etc., 1953. 

9. B. O. Peirce, A short table of integrals, 3d ed., Ginn, Boston etc., 1929. 

10. H. Pollard and O. Shisha, Variations on the binomial series, this MONTHLY, 79 (1972) 
495-499. 

11. E. C. Titchmarsh, Introduction to the theory of Fourier integrals, Oxford, 1937. 

12. G. N. Watson, A treatise on the theory of Bessel functions, 2d ed., Cambridge, 1944. 

13. A. Zygmund, Trigonometric series, 2d ed., Cambridge, 1959. 

14. T. J. Osler, An integral analogue of Taylor’s series and its use in computing Fourier
transforms, Math. Comp., 26 (1972) 449-460.


ENGLAND WAS LOST ON THE PLAYING FIELDS OF ETON: 
A PARABLE FOR MATHEMATICS 


A. B. WILLCOX, Executive Director, MAA 


I am sure that most of you have had the experience at one time or another of
discovering, in an unexpected place, an old newspaper, its pages yellowed with age. 
You may have found that a glance at one of the old news articles jolted your mind 
into a moment or two of serious reflection on how far we have come since those 
bygone days. I was rummaging through the attic of my imagination recently when 
I came across a newspaper dated May 1, 1980, its pages pale with the years not yet 
lived. A glance at an article I found there jolted my mind into something more than 
a moment of reflection on where we are going. It was, it seemed to me, .... 


Alfred Willcox received his Yale Ph.D. under Charles Rickart. He served as Instructor through 
Professor at Amherst College and has held Visiting appointments at the Univ. of Chicago, the Univ. 
of Uppsala, Sweden, and the Univ. of Wisconsin. His main research interest is functional analysis. 

He is presently the Executive Director of the MAA and previously served the MAA on a number 
of committees, as Second Vice-President, and as Executive Director of CUPM. He is the co-author 
and editor of the Willcox, Buck, Jacob, Bailey Calculus Series (Houghton Mifflin, 1971). Editor. 


26 A. B. WILLCOX [January 


1. A parable for mathematics. 


LONDON, May 1, 1980. It is only a coincidence, but an interesting one, nevertheless, 
that the eve of the first Congress of Committees for a New Beginning should fall on 
the birthday of the Duke of Wellington. An event on which is focused the hope of 
60 million inhabitants of this island for a new destiny falls on the day of birth of one 
of the central figures of that destiny which was once England. England: the hub of a 
global empire, the dream of a great culture, a nation which inspired love, devotion, 
sacrifice and pride, an idea which died a quiet death from natural causes on March 
23, 1979. This reporter was so intrigued by this confluence of events that he paid a 
brief visit recently to one of England’s last Lords, Jefferey Allyn-Smythe, Lord of 
Devonshire, Member of Parliament on the day England died. 

We chatted, in quite an informal way, about Wellington’s England, about the 
ideals and values which once were the heart and spirit of England, about how these 
ideals gradually became institutionalized in an Establishment which, once it had 
become the sole custodian of the spirit and destiny of England, lost its ability to 
change with the world and adapt to the real needs and desires of the people until it 
finally died quietly and without fanfare in a vapor of irrelevance. 

At one point in our conversation I mentioned those words of Wellington which 
had once stirred men’s hearts, “The battles of England are won on the playing
fields of Eton.” “That sums it all up,” exclaimed Allyn-Smythe, “The rise and
fall. It’s all there. Wellington, you know, had more in mind than just strong bodies 
and stout hearts. Eton exemplified all that was grand about England. At Eton 
the cream of England’s youth was prepared, and prepared well, for the kind of 
service and unswerving loyalty and devotion that built a global Empire and 
developed a culture as solid as Gibraltar. At least, that was the way it appeared. In 
actual fact, during the first half of this century Gibraltar began to crumble. It was 
difficult, the realization that England was after all just a smallish island in a large 
sea, but we felt that we accommodated to this realization with grace and resolve. 
What we did not notice was that Eton did not accept the changes which were occurring. 
The cream of England’s youth continued to be prepared, and prepared well, for the 
kind of service and unswerving loyalty and devotion which was to preserve a myth 
for two decades. All through the 60’s and 70’s, Eton and the English Establishment 
which it represented protected and preserved a skeleton of quality, a dream of 
greatness, and a pretense of destiny while the world simply lost interest. Education 
at Eton actually increased in quality and increased in vigor, while Eton itself 
ignored all the signs that it was becoming irrelevant. The Establishment was still 
present in all its glory and all of its tradition when the world simply walked away. 

“We all know the events which signaled the death of England. In the election
of 1979, there simply weren’t enough interested British voters to return a Parliament 
able to govern. The small band of bewildered M.P.’s who were left after the election 
— there were eleven, weren’t there — simply went home. For a year now this tiny
island has coasted along on inertia. The bureaucracy has kept the wheels turning,
the managers have kept the store, until the recent riots in a few densely populated 
areas, and a rising undercurrent of fear across the nation — excuse me, across the 
land — have led to the formation of the local Committees for a New Beginning 
which come together tomorrow to fashion a new destiny for a people waiting to 
become a nation again. 

“Whatever that destiny is, it is not Wellington’s. Millions of English hearts
swelled with pride when he said, ‘The battles of England are won on the playing
fields of Eton.’ I rather think that a Wellington of today might give us words steeped
more in irony than pride, offering a challenge instead of proud congratulations:
‘England was lost on the playing fields of Eton.’ We might add, if we profess to have
hope, ‘England will be rediscovered in the hearts, and rebuilt by the hands, of
Englishmen.’”

I must admit that I find this bit of fantasy, this concocted look into the future, 
a bit more embarrassing every time I read it. I wrote it in one of those flashes of 
revelation which occur with brilliant displays of light in the middle of a sleepless 
night and which turn out to be grotesque Alice-in-Wonderland doggerel in the cold 
light of the dawn. 

Nevertheless, having committed myself to this ridiculous title before really 
thinking about how I would feel standing here before you reading it, I decided that 
the only thing to do was to see it through with a straight face. I won’t be hurt if you 
smile a little, even condescendingly. 

I am not embarrassed, however, to suggest that this silly story may indeed be a 
parable for mathematics. Listen, for example, to this true story, equally insignificant 
by itself, which ought to give pause to anyone who has his eye open today and who 
professes an interest and stake in the future of mathematics in our culture. 

A friend of mine recently told me that he has been teaching an advanced calculus 
course after a number of years away from this mainstay of the undergraduate cur- 
riculum. He said that he dutifully included in the course a section on line integrals, 
because it seemed the thing to do and because the topic was contained in the book he 
was using. After he had given a particularly brilliant lecture on the subject one day, a 
student came to him and asked why line integrals were contained in the course—what 
did they have to do with anything outside of mathematics, or even outside of calculus. 
My friend began describing the usual applications to physics which give line integrals 
a prominent place in the honor roll of “relevant mathematics.” The student
interrupted impatiently, “I don’t know anything about physics, and, frankly, I don’t
care much about it. I am an economics major. Can you describe any applications of
line integrals in the social sciences?” My friend drew a complete blank and fell
back on the old defensive position, “I’ll think about it over the weekend and
report back to you on Monday.”

He thought several times during the weekend and looked in a number of books 
which he had in his study, but had no success at all. On Monday he had to confess
that to his knowledge there are no significant applications of line integrals in the
social sciences. “Then, why should I, an economics major, be forced to spend a
week of my time on line integrals,” said the student, “a subject which I can’t relate
to my own experience and which I find boring?” This sent my friend back to his last
line of defense, “But it is beautiful mathematics of great intrinsic worth and appeal.”
This explanation had always sent his students nodding back to their desks, but not
this time. “I am not interested in playing chess, no matter how challenging it is
intellectually. I am in college to learn how to do something significant for society,
and I am interested in what mathematics can do to help me. I don’t have time for
chess.”

“Do you know,” said my friend, “I have heard that many times before and
I have devised a hundred devastating rejoinders, but somehow, at that moment, I
wanted to shrink to a point and disappear.” I am not interested in playing chess.

At the end of the last century England literally encircled the globe. It gave to the 
world the riches of a highly developed culture and a great dream, and in return for 
this the world repaid England with riches of a more tangible kind. But for all 
its power and grandeur, England did not solve the pressing problems of hunger, 
poverty, and lack of shelter in the world and eventually the world simply walked 
away. That is history. It remains to be seen whether the slide into complete irrele- 
vance is to be continued. I don’t really believe that it is, but somehow it doesn’t 
seem impossible. 

In 1972, mathematics spans the world of science and much of the world that 
stands on the edge of science. It has given this world unity, coherence, powerful 
tools for reason, and much elegance. In return for this, the world has rewarded us 
richly — we would not be so ungrateful as to deny that. But somehow, we are 
aware that mathematics has not only failed to solve as many of the great problems 
as the world expected it to solve, but that it has actually begun to lose contact with 
much of the world outside its own little island. Is mathematics a game of chess? 
Will the world lose interest in playing chess? 

I would like to share with you a random selection of straws in the wind, drawn 
from journal and newspaper articles, letters, and other lore, which indicate the 
directions and the source of my concern, and then to state as succinctly as I can the 
clear challenge which this wind — still just a gentle breeze — wafts into my inquiring 
nostrils. 


2. Straws in the wind. 

STRAW # 1. At its 1970 Annual Meeting in San Antonio the MAA presented a
panel discussion on the reports of COSRIMS (Committee on Support of Research 
in the Mathematical Sciences). During this panel discussion Ralph Boas described 
COSRIMS as follows [1]: 


    “What is—or was—COSRIMS? It was a 12-man committee of the National Academy of
Sciences under the chairmanship of Lipman Bers, and the name stands for Committee on the
Support of Research in the Mathematical Sciences. Before it was through it had 50 or more
collaborators doing things for it, and another 50 or so worked on the CBMS Survey, with
John Jewett as executive director, that collected the needed data. The principal product of
COSRIMS was simply entitled, ‘The Mathematical Sciences — a Report’; it was completed,
after more than a year of work, at the end of 1967 and issued late in 1968.”


In introducing the panel members, the Moderator, Arnold Ross, made the 
following comments on events which have transpired since the publication of the 
COSRIMS reports [2]: 


    “In introducing the COSRIMS Reports, Lipman Bers placed the then current major
concerns of our mathematical community into a proper perspective with admirable clarity.

The all pervasive nature of mathematics to which he referred is attested to by the variety 
of concerns exhibited by these reports. Accepting the premise of the inherent esthetic appeal 
of mathematics and the demonstrable need for mathematics in the sciences and the professions, 
the COSRIMS Reports projected an influx of young talent into mathematics on an ever- 
increasing scale. The problems of education and research, so it seemed, were amenable if 
only one could throw into the fray sufficient material resources and adequate resources of tal- 
ent trained to the level of a Ph.D. 

However, much has happened since the writing of the reports which put to a severe test 
the comfortable assurance of our mathematical community. 

There has been a strong alienation of young talent away from mathematics. Social unrest 
has reached the campus and our adequacy as mentors of the younger generation has been 
vociferously questioned. We have not been able to command material resources which we feel 
we need for the task of mathematical education at hand. Our expectations for the cream of 
our mathematical manhood, our young Ph.D.’s, have not been fulfilled — both in regard to 
what they expected and in regard to what has been expected of them. 

As a result of the above disturbing confrontation with the new realities, many voices have 
been raised to urge a critical reappraisal of our academic responsibilities. A grassroot move- 
ment emphasizing the need for a more sensitive response to the needs of our students and to 
the needs of our young mathematical colleagues, and of our colleagues of all ages in the 
related sciences has sprung up. The area in grass is not very large as yet but the grass is high. 
Some of our most accomplished colleagues have joined this movement, thus giving the lie to 
the all too common heresy that there exists an intrinsic conflict between teaching and 
research.”


STRAW # 2. Earlier, in 1966, Edwin Spanier [3] wrote a memorandum to 
CUPM expressing some of his concerns about the undergraduate curriculum. In this 
memorandum, Spanier said, 


    “It is my contention that the universities and colleges are not doing a good job of
educating the undergraduate in mathematics. This is not because we do or don’t have certain courses
available, but more because of the attitude of the instructors and the emphasis in the courses. 
Most of the undergraduate mathematics majors are probably reasonably qualified to go to 
graduate school in mathematics but not for anything else. The demand that existed for mathe- 
matics majors in industry seems to have disappeared, and I fear this is because industry 
has learned that mathematics majors aren’t as well trained for their needs as people with other 
majors. 

A mathematics major should learn something about what mathematics is, both in terms 
of its internal structure and in terms of its relations with other areas. An undergraduate would 
probably get a more rounded presentation of mathematics in the above sense if he studied
engineering or computer science. This is unfortunate, and we should make a serious effort
to change this state of affairs.”

“... We all share responsibility for its state. I don’t know how to change the situation, but I
am firmly convinced that if we don’t, we will find ourselves playing a progressively smaller
role as that of the engineers and computer scientists grows....”


STRAW # 3. At that same San Antonio meeting, R. D. Anderson presented a 
talk “Are There Too Many Ph.D.’s?” [4]. He began his talk by answering the question
as follows: 


    “The answer is ‘Not Yet.’ However, a valid interpretation of such an answer is that there
may well be ‘too many Ph.D.’s’ within a few years. In the author’s judgement, based on the
evidence cited below, there should be positions for Ph.D.’s in mathematics for the next two 
years; however, a large percentage of the available academic positions will be in colleges or 
universities without Ph.D. programs in mathematics, without research libraries, and with 
teaching loads which are larger than those prevailing at Ph.D. granting universities. By the fall 
of 1972 there are likely to be more Ph.D.’s looking for positions than there are (adequately 
salaried) positions with duties commensurate with Ph.D. level training in mathematics.... 

“Over the last 25 years, at least, the principal thrust of graduate training in the mathe- 
matical sciences has been that of training in core mathematics, chiefly pure mathematics. We 
have trained Ph.D.’s in our own image, to regard research as the principal purpose of mathe- 
maticians and the primary (and almost the only) route to real status in the profession. In a 
sense we have been fabulously successful. American mathematics has been playing an increas- 
ingly important and central role in world mathematics. Research — and good research — has 
been flourishing as never before. We have inculcated our graduate students with the attitude 
that teaching was more a means to develop new researchers and a means of support of 
researchers than an end in itself. And, by and large, we have done little to encourage involve- 
ment of our graduate students with applications of mathematics. There is a real need for a 
rapid change of attitude and action on both of these counts. Many of our young Ph.D.’s will 
be employed primarily as teachers and many others will need to find positions outside 
academic life where applications of mathematics will be the basis for their continued
employment. Nationally we must alter some of our patterns of graduate education. It does not follow
that each university or each graduate student should have a radically different program. It 
does follow that, statistically, many should.” 


Toward the end of his talk, Anderson listed several recommendations based on his 
observations. Among them were: 


“(1) In order to provide the future Ph.D. with necessary options for employment, opportunities
for substantial training in applications of mathematics (particularly the new applications) 
should be offered to (but not necessarily required of) graduate students in core mathe- 
matics departments. 

(2) Since academic employment of future Ph.D.’s will be more dependent on teaching per- 
formance, greater stress should be placed on training in teaching and for teaching. 

(3) Mathematics departments should actively promote the introduction of more and better 
service courses, i.e., courses for students in other disciplines. Not only will this accelerate 
the mathematization of society but it will also increase the demand for mathematics 
faculty.” 


STRAW # 4. Some of my straws fall obliquely on the issue. But sometimes
light from the side reveals detail most sharply. In January of 1970 Alvin Weinberg
[5] said in an article in SCIENCE magazine, entitled “In Defense of Science”:




“It is incredible, but true, that science and its technologies are today on the defensive. The 
attack, which is most noticeable in the United States, has been launched on four fronts. 
First, there are the scientific muckrakers, mostly journalists, who picture the scientific enter- 
prise as being corrupted by political maneuvering among competing claimants for the scientific 
dollar. Second, there are thoughtful legislators and administrators who see a waning in the 
relevance of science to the public interest, especially as we address ourselves to grave social
questions that are hardly illuminated by science. To deny connection between science and 
public affairs weakens one of the main arguments for public support of basic science: that out of 
basic science comes technology, which in turn improves our human condition. Third, there 
are the many technological critics who urge a slowdown, or at any rate a redirection, of 
technology because of its detrimental side effects. And finally, there are the scientific 
abolitionists: the very noisy, usually young, critics who consider the whole
scientific-technological, if not rationalistic mode of the past 100 years a catastrophe. To them technology is
the opiate of the intellectuals; some of the more extreme would demolish human reason, as 
the ultimate tool for achieving human well-being. The consequence, or perhaps, a further 
symptom, of all this harassment is a reduction in society’s support for science. The U. S. bud- 
get for science has fallen from 2.5 percent of the gross national product in 1965 to 2 percent 
in 1969.” 


STRAW # 5. This straw has been broken into several pieces. It concerns the 
current employment situation in mathematics and one interpretation of the meaning 
of this situation. 

An article in the April 26, 1970 Washington Post [6], entitled “Ph.D. Glut
Creating a Jobless U.S. Elite,” began:

“‘I just can’t find anything at all,’ complains a bitter young scientist who won his coveted
doctor of philosophy degree at the University of Maryland this past winter and has yet to 
land a satisfactory job. 

“His plight reflects a dramatic development in higher education. The Ph.D. has suddenly 
ceased being a certain passkey to professional and sometimes financial rewards. The new 
Ph.D. recipient this year cannot be sure of finding any job that approaches what he had
looked forward to during those long, arduous years of postgraduate study.”

In an article, “Academic Employment Prospects for September 1972,” appearing
in the February 1972 issue of the Notices of the AMS, R. D. Anderson reports 
the following balance sheet for the academic employment situation in September 
of 1972. This balance sheet represents an estimate based on recent surveys conducted 
by the AMS Committee on Employment and Educational Policy. 


A BALANCE SHEET FOR ACADEMIC JOBS IN MATHEMATICS

      Academic Job Seekers                                Academic Jobs Available
1.     900  (new Ph.D.’s not already having jobs)           500  (from survey)
2.     200  (currently professionally unemployed)           100  (death and retirement)
3.     500  (nonretainees)                                  500  (jobs of nonretainees)
      1600  Total job seekers                              1100  Total jobs
4.    -300  Nonpure mathematicians                         -300
      1300  Pure mathematicians seeking jobs                800  Jobs for pure mathematicians

5. Prospective professionally unemployed pure mathematicians: 500 ± 200.
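The arithmetic of the balance sheet can be retraced in a few lines. This is only an illustrative sketch: the variable names are hypothetical, and the figures are those printed in the table.

```python
# Figures as printed in the balance sheet; the dictionary keys are descriptive labels only.
job_seekers = {
    "new Ph.D.'s not already having jobs": 900,
    "currently professionally unemployed": 200,
    "nonretainees": 500,
}
jobs_available = {
    "from survey": 500,
    "death and retirement": 100,
    "jobs of nonretainees": 500,
}
nonpure = 300  # nonpure mathematicians, subtracted from both columns

total_seekers = sum(job_seekers.values())
total_jobs = sum(jobs_available.values())
pure_seekers = total_seekers - nonpure
pure_jobs = total_jobs - nonpure
shortfall = pure_seekers - pure_jobs

if __name__ == "__main__":
    print("Total job seekers:", total_seekers)
    print("Total jobs:", total_jobs)
    print("Pure mathematicians seeking jobs:", pure_seekers)
    print("Jobs for pure mathematicians:", pure_jobs)
    print("Central estimate of prospective unemployed:", shortfall)
```

Subtracting the same number of nonpure mathematicians from both columns leaves the gap between seekers and jobs unchanged, so the central estimate in item 5 is simply the difference of the two column totals.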




These two pieces of the straw describe the current uncomfortable employment 
situation for Ph.D.’s in mathematics. The final piece “‘explains’’ this situation in a 
certain admittedly simplistic and even frivolous way. But the “‘explanation’’ is not 
totally devoid of validity and has a moral! The COSRIMS report, to which I have 
referred earlier, contains several tables listing the number of earned B.A. and Ph.D.
degrees in the mathematical sciences each year from 1954 through 1966. The tables
also contain projections of the predicted output of B.A.’s and Ph.D.’s through 1976.
The predictions were toned-down versions of projections made by the U.S.O.E. 
after careful study and the application of the most sophisticated statistical techniques. 
Now we can compare these predictions with the actual output from 1966 to date. 
These comparisons are contained in the following two charts. 


[Chart: B.A.’s in the Mathematical Sciences, 1963 to 1970, predicted and actual output.]

[Fig. 2. Chart: Ph.D.’s in the Mathematical Sciences, 1964 to 1971, predicted and actual output.]


Notice how different the situation is today from what was predicted. More Ph.D.’s 
and fewer B.A.’s. More teachers, and hence pressure for more mathematics classrooms, 
but fewer undergraduates interested in mathematics for mathematics’ sake. MORAL:
If we value our jobs, it behooves us to see that our classrooms are filled with a greater 
concentration of students whose first loyalties lie elsewhere. 


STRAW # 6. In a letter [8] to me in 1970, a loyal MAA member and MONTHLY
reader expressed a concern about a disturbingly one-sided attitude towards mathema- 
tics which he detected in the pages of the MONTHLY. He said: 


“Quite generally, recent issues of the MONTHLY seem to suggest feelings of self-complacency 
and smugness of some mathematicians which I can’t believe are shared by most MAA mem- 
bers. Certainly, the exhibition of such feelings is not conducive to that attitude of increased 
social awareness and responsibility which in the opinion of some is demanded by the times. 
And by this I mean neither Viet-Nam nor Chicago, but rather, say the teaching of mathematics 
to non-mathematicians. Could it be that we need a little more insecurity?”


STRAW # 7. At the 1970 San Antonio Meeting, W. L. Duren [9] presented
some slightly different views on the question “Are There Too Many Ph.D.’s in
Mathematics?” During his talk he said:


“It is fair to say that, in the Federal planning which directed the support for an expanded 
graduate program in mathematics, there never was any indication that what was needed was 
more Ph.D. mathematicians who were traditionally trained to do research in some narrow 
field of pure mathematics. At best these research specialists were needed in greater numbers 
only as machine tools to produce more mathematicians. The real social needs for more 
mathematicians all came from the peripheral aspects of the field: for the computer revolu- 
tion, aerospace efforts, optimization in engineering design, business management, for con- 
ceptual models to push forward in life and social sciences, and for more teaching to help 
young people to get jobs in a technical world. Besides this, an expanded graduate program in 
mathematics was needed to support the production of more Ph.D.’s in engineering and 
physical sciences.” 


STRAW # 8. The February 7, 1971 issue of the Washington Post contained 
an article by Daniel S. Greenberg [10] entitled “Prestigious Science has Feet of Clay.”
It was a clear statement of the frustration which the “other” world feels with a
science community which seems always to consume large amounts of money and 
prestige, pour forth large quantities of advice, but not to relieve in any systematic 
and regular way the awful suffering of Mankind. The article begins: 


“‘The Republic has no need of scientists,’ declared the president of the French revolutionary
tribunal as he sentenced the chemist Lavoisier to the guillotine. The wish that it were so may
have strayed across the minds of some present-day officials, including Messrs. Nixon, Kosygin
and Brezhnev and perhaps even Mao, as each in his own setting has confronted that peculiar
and restless conglomeration known as ‘the scientific community.’

“Seemingly insatiable for public funds, prickly and righteous when the public seeks a say 
about the use of those funds but frequently unhesitant to pronounce on political and social
issues, scientists constitute a group that is at once indispensable and often indigestible. Robert
M. Hutchins encountered the tribe in his long ago days as head of the University of Chicago,
and later concluded: ‘A scientist has a limited education. He labors on the topic of his
dissertation, wins the Nobel Prize by the time he is 35 and suddenly has nothing to do... He has
no alternative but to spend the rest of his life making a nuisance of himself.’”


For a touch of humor with a sharp edge, Greenberg’s article is printed next to the 
following cartoon after the late Rube Goldberg. 


[Cartoon]

RUSSIAN ENGINEER (A) PRESSES BUTTON, [illegible] ROCKET (B) CARRYING SPUTNIK WHICH ORBITS
AMERICAN JOHN Q. PUBLIC (C) SCARING THE PANTS OFF HIM. FALLING PANTS TRIP LEVER (D) OPENING
PUBLIC COFFERS (E) FROM WHICH COINS DROP INTO OUTSTRETCHED HANDS OF SPACE-ORIENTED
SCIENTISTS (F, G, H), WHO LET SOME COINS TRICKLE TO FLOOR WHERE EDUCATORS (I, J) SCRAMBLE
[illegible] FOR A SHARE AND SET OFF [illegible] (K) ACTIVATING TREADMILL (L). PRETTY GIRL (M) STARTS
WALKING AND DISTRACTS YOUNG CHEMIST (N) WHO KNOCKS OVER USELESS SPACE EFFORT
CHEMICAL (O) WHICH FALLS ON CANCEROUS MOUSE (P), CURES HIM, AND PROVIDES
RESEARCHER (Q) WITH CANCER CURE WHICH IS WHAT EVERYBODY HAS BEEN LOOKING FOR.


© 1970, American Chemical Society, Reprinted by permission from 
Chemical and Engineering News, Vol. 48, December 21, 1970, page 5. 


STRAW # 9. A recent issue of the Notices of the American Mathematical
Society contained a “Letter to the Editor” from Mary B. Williams. Her letter reads,
in part, [11]:


    “The thing that an interdisciplinary mathematician most needs is an understanding of the
relationship of mathematics to the real world. The best mathematics comes from a deep 
intuitive understanding of the structure of some portion of the real world; pure mathematics 
is useful because it starts from mathematical structures which were abstracted from the 
structure of the real world by earlier, non-pure mathematicians. But at present mathematicians 
are taught that mathematics is a free creation of the human mind, justified not by its con- 
nection to the real world but by its own intrinsic beauty. I realize that this philosophy solves 
(or, rather, relegates to the nether regions) some extremely difficult problems concerning the 
connection of mathematics with reality, but it is false. (Possibly most first-rate pure mathe- 
maticians would agree that it is false; nevertheless, it pervades their teaching.) Mathematicians 
expect usefulness to be an inevitable by-product of mathematical beauty; they justify this 
expectation by historical examples of pure mathematics which turned out to be useful, but 
they have no understanding of why these examples turned out to be useful and consequently 
when they want to do something useful they work on the assumption that mathematical beauty 
guarantees ultimate usefulness. This leads them to feel that a superficial understanding of the 
real world problem, together with their own mathematical creativity, is all that is necessary to 
do worthwhile work; and it leads them to reject as irrelevant the objection that their results 
don’t solve any problem the scientist is interested in. Naturally the science departments do 
not want to hire mathematicians with this attitude.” 


STRAW #10. At the 1971 MAA Summer Meeting in Laramie, Lowell Paige
[12] gave a talk entitled “Public Understanding of Science and Its Implications
for Mathematics.” Every thoughtful member of the Association should read his
talk which is reprinted in the MONTHLY, February 1971. Toward the end of his
remarks, Dean Paige made the following comments:


“Let us look at some of the consequences which the mathematical community must face. 
First, it is to be expected that a major portion of any additional funds recommended for the 
National Science Foundation this year will be assigned to interdisciplinary programs directed 
at the problems of society. Even a casual reading of the testimony before the Special Sub- 
committee on NSF of the Senate reveals this fact. Hence, the fiscal support available for funda- 
mental research in science, including the mathematical sciences, will remain approximately 
the same as last year.... 

“The most widely discussed criteria proposed for the assignment of priorities to scientific 
research are those advanced by Dr. A. Weinberg of the Oak Ridge National Laboratories. The 
criteria of justification proposed for the support of science were: technological merit, scien- 
tific merit, and social merit. It is in the discussion of scientific merit that he states, ‘I would 
therefore sharpen the criterion of scientific merit by proposing that, other things being equal, 
that field has the most scientific merit which contributes most heavily to and illuminates most 
brightly its neighboring scientific disciplines.’ 

“If the preceding is taken without modification as a reasonable basis for the allocation of
funds within the National Science Foundation, then the mathematical sciences section will 
need all of the assistance the professional societies can provide for the justification of their 
requests. To illustrate my concern, I note that in Weinberg’s discussion of scientific merit 
preceding his recommendations he appeals to the following comment of von Neumann: ‘As 
a mathematical discipline travels far from its empirical source, or still more, if it is a second or 
third generation only indirectly inspired by ideas coming from reality, it is beset with grave 
danger. It becomes more and more pure aestheticizing, more and more purely l’art pour l’art.
This need not be bad if the field is surrounded by correlated subjects which still have closer 
empirical connections or if the discipline is under the influence of men with an exceptionally 
well developed taste. But there is grave danger that the subject will develop along the line of 
least resistance, that the stream, so far from its source, will separate into a multitude of insigni- 
ficant branches, and that the discipline will become a disorganized mass of details and 
complexities.’ 

“To be brief, the appeal is to relevance; not in the sense attached to relevance by students 
but in the intellectual context of unifying concepts. I do not interpret von Neumann’s remarks 
to be a clarion call for slavish devotion to the applications of mathematics; but I am certain
that Weinberg and other scientists are not so inclined. 

“Many mathematicians have expressed the need for our courses and research efforts to
reflect the relation between various areas of mathematics as well as to the applications to 
other disciplines. 

“I would propose that our writing include more than a feeble pass at articles designed to
illustrate the unifying aspects of abstract concepts for the non-mathematical scientist. The
initial effort of the COSRIMS reports must be continued if we are to convince our scientific
colleagues that the plea in Hardy’s toast, ‘Here’s to pure mathematics. May it never have any
use,’ has not been fulfilled. Thus, I find articles of the nature of Saunders MacLane’s in the
June/July issue of the MONTHLY to be of considerable importance....

“I have devoted considerable time to what might appear to be the selfish interests of
faculty members. Now I wish to consider the important component of our concern: the
students. What will be the effect of present attitudes upon our students?

“There is no doubt in my mind that the growing contention that science and technology 
are insensitive to our social problems is driving undergraduates from Science and Mathematics
to the Social Sciences. This can only result in further alienation from mathematics and I
submit that one of our curriculum disaster areas is in courses designed for non-mathematics 
majors in addition to those service courses we provide for the various disciplines. 

“Even if we choose to ignore the nonmathematics majors, our undergraduate majors
cannot help but notice the reduction in fellowships and research assistantships for graduate 
study. It is estimated that the reduction will be approximately 20% this year. Is it any wonder 
that students are discouraged when the prospects for support are diminishing? And to this 
distressing note, we might add the publicity of an oversupply of Ph.D.’s which has been widely 
discussed in the mathematical community.”


STRAW #11. This last straw is anecdotal and also anonymous, because for
all I know the story is still being acted out. A friend, chairman of the mathematics 
department at a major university, recently told me that the chemistry department in 
his university is currently debating a proposal to eliminate the requirement that 
its majors take a sophomore course given by the mathematics department. 
The proposal is to require instead that they take a mathematics course offered by the 
chemical engineering department. The mathematics department, they say, does not 
teach the mathematics they wish their majors to have and the chemical engineering 
department does. “I can hardly criticize the quality of the mathematics course the
chemical engineers offer or their competence to offer it,” says my friend. “After all,
the man in charge of it was formerly the chairman of the mathematics department at
X University.” X is also the name of a major and respected university.

When I heard this story I instantly recalled a statement made to me years ago by
the Head of a Civil Engineering Department: “The Mathematics Department has
us over a barrel, right now,” he said, “but we are stockpiling mathematicians in our
department for a future rainy day.”

Have you taken a look at the weather reports lately? 


3. Where does the wind blow. The past quarter-century has seen unprecedented 
growth in mathematics, particularly in the United States. On the threshold of the 
70’s the discipline stands as a strong force in the world of science and ideas. This 
world has accepted willingly — almost eagerly — our pronouncements about the 
importance and power of mathematics in a scientific age. It has even accepted, 
albeit somewhat less eagerly, our claims about the beauty of mathematics. The widen- 
ing sphere of influence of mathematics in science has been noted by the world and 
the world has supported us in a style which befits our stature. 

However, there has been a sudden subtle shift in the wind. Dispatches bring news
of unrest in some of the farflung colonies. The treasure ships have been returning to
port riding somewhat high lately. The flow of recruits from the hinterlands has
fallen off. The world has begun to remind us that when we claim that a discipline is
basic, central and powerful, we are referring to its impact outside of itself. The
world grumbles that it cannot afford, at this moment in history, the luxury of elegance
for elegance’s sake. It hardly has time to play chess any more, sorry, good game and all
that, but no time any more.


The change in attitude of the world may well be very good for mathematics. 
Our discipline has always grown in cycles, which bring it closer at some times than 
others to the influences of the real world. Even in times of great abstraction and 
purity, mathematics drinks from the wells of science and society. But like any 
living organism, mathematics does have the option of remaining away from the well. 
Eton doesn’t have to change. It is only the consequences of such a decision which 
are inevitable. Suicide is only fatal, it is not unavailable. 

The future of mathematics in the next decade or two is not in our hands, yours 
and mine. It is in the hands of the students who populate our undergraduate class- 
rooms right now. Mathematics has no future unless they stay there and emerge at the 
end of four years with a desire for mathematics as a career. And today’s youth will 
not be bought by promises of elegance. They want to know what mathematics has 
to do with the price of eggs. 

On a more mundane level, but one which we certainly cannot ignore, the students 
in our undergraduate classrooms pay our salaries. It seems quite possible that fewer 
of these students will be mathematics majors in the immediate future, and it is a 
matter of hard and selfish economic necessity for us to pay more attention than we 
have to what we are saying to these students who are more interested in the price 
of eggs than in mathematics. 

This means a significant change in attitude in the teaching of undergraduate 
mathematics toward questions of the relationship of mathematics to other concerns. 
It is not just a matter of teaching more applied mathematics, or even of putting more 
examples and applications into our regular courses — although it does involve this. 
It is a matter of viewing our subject as a part of the larger efforts of man to describe, 
understand and control his physical, social and intellectual environment. 

What does this mean for the average teacher in the average classroom in
the average department of mathematics? I cannot give any tested recipes for relevance.
I don’t believe that there are any. We are all individuals with individual tastes in 
mathematics. We adhere to different special fields within mathematics and these 
differences filter down even to our undergraduate classrooms. Relevance means 
one thing to a specialist in partial differential equations, another thing to a number 
theorist. Nevertheless, a few general observations might be made which apply 
to nearly all of us. 


1. Just turn over in your mind each day the fact that no part of mathematics
is totally and inherently immune to applicability and totally divorced from inspiration 
from outside of mathematics. Be sensitive to and interested in any little bridge 
between mathematics and other concerns that you happen upon in your travels, 
even though it is a tiny, seldom traveled footbridge. Even if it is only a narrow plank 
it may be used to cross the interface and there may be an idea on the other side 
which sheds new light for you on mathematics. If you are sensitive to these little
bridges, even though you have no need to cross any of them, that sensitivity will be
passed on to your students, many of whom won’t accept mathematics without it. 

After I had given an earlier version of this talk, a young instructor came up to me 
and asked what he could do to make mathematics more relevant in his classroom. 
He is a number theorist, and he was in the fortunate position to spend most of his 
time teaching number theory to undergraduates. “How do you make number
theory into applied mathematics?” he asked ruefully. I hadn’t thought about my
answer to such a question in advance, so I could only stammer out some such 
general pep-talk as I have made immediately above. Thinking of it later, I wished 
I had remembered a beautiful bit of applied mathematics I had come across years 
ago during one of my rare experiences in teaching number theory to an undergraduate 
class. The example is in Ore’s charming book, “Number Theory and its History.”
Indices modulo an integer are used to obtain a simple set of rules for splicing
telephone cables in such a way as to minimize interference, or “cross-talk”, between
circuits. Encountered unexpectedly in the middle of a course in the purest of pure 
mathematics, the application is a refreshing breath of air from outside. Of no par- 
ticular importance by itself, it does shine a new light on the mathematics. 


2. When you serve on a text-book selection committee insist that one of the 
elements to be considered in evaluating a given book is the picture it presents of 
mathematics in a broad intellectual context. How well does the book handle the
available ties between that particular mathematics and other concerns, both for
input and for output, motivation and application? Clearly, this isn’t the only criterion
for choosing a book and in many cases it may be a minor one, but don’t leave it 
out of the equation. 


3. In choosing topics for undergraduate seminars and colloquia, choose subjects 
which involve cross-fertilization between mathematics and other fields. This may 
take you far from your own field of competence in mathematics, but it is re- 
freshing sometimes to sit in a seminar in which you know very little more than the 
students about the mathematical subject under discussion. One of the masters in 
finding intriguing ties between very pure mathematics and an amazing collection 
of other fields is Victor Klee. I urge you all to show his two films, “Shapes of the
Future, I and II” to your undergraduates. He can find fascinating connections
between unsolved problems in geometry and combinatorics and problems of current 
interest in solid state physics, virology, organic chemistry, botany and other fields. 
A number of these are reported in the Research Problems section of the MONTHLY, 
a rich source of mathematical topics with frequently unexpected applications to 
other fields. 

4. Be open and sensitive to situations in which inter-disciplinary courses or
seminars would be feasible and interesting on your campus. Our habits of disciplinary
thinking are so strong that inter-disciplinary courses are not easy to establish and
are often difficult to sustain. In the presence of students, our worst chauvinistic
instincts have a way of coming to the surface. However, as a one-time participant 
in an inter-disciplinary course which lasted more than ten years, I have come to 
believe that no matter how brief the marriage, both partners (to say nothing of 
the students) reap rich benefits. 


5. Establish cordial relations with individuals in other departments. Attend 
their seminars and colloquia occasionally. Suggest subjects for their seminars or 
colloquia when you know of topics in their fields where significant mathematization 
has occurred. Invite them to attend your seminars and colloquia when speakers 
or topics appear which might interest them even peripherally. 


6. Ask yourself ten times each week whether the cross-fertilization between
mathematics and other fields is not a subject of potential interest for any
mathematician regardless of his mathematical interest. It may not become a central part
of his mathematical life, but neither are many other things he finds intellectually
interesting. For that matter, the biologist and even the physicist have not traditionally
found the mathematicians’ first love all that interesting. But is it not possible (not
probable, perhaps, but just possible) that the reason for the lack of interest from
our scientific colleague is that no mathematician has ever been willing to talk with
him, in terms comprehensible to the intelligent layman, about what he is doing?

I have put your patience to a severe test in dragging you through parable and 
metaphor. I ask you just once more to give your imagination free rein, as we make 
one more stop at the game of chess. Suppose someone were to discover that the rules 
and strategies of chess lead, on proper interpretation, to a strategy for traffic control 
which solved forever the problem of optimizing the flow of traffic in a large and 
congested urban area (the chess board) so that large masses of vehicles could move 
from one area to another, with virtually no accidents and absolutely minimal delays. 
What would this do for chess? (It is obvious what it would do for modern urban 
society.) Bobby Fischer and Boris Spassky would probably give scant recognition 
to the new applied chess. In fact, they and other chess purists would probably treat 
the development with disdain and even alarm. Traffic control is, after all, far from 
what chess is all about. But I wonder if the publicity alone would not cause a boom 
of unheard of proportions in chess playing. I wonder if sales of chess sets wouldn’t
double over a period of years, support of chess clubs triple, grants in basic research in 
chess quadruple. It is not unlikely that the world’s champion chess player of the 
1980’s might be attracted to the game in 1971 because of this startlingly new bridge 
between the real world and the red, white and ivory world of the chessboard. Who 
would come off better in this game, the traffic engineers or the chess players? 

I have tried to present some of the background and reasons for a conviction 
which is growing in my mind that some of the most significant events in mathematics 
in the 70’s may well occur in the undergraduate classrooms of our nation. Not only 
do we begin the process of procreation there, we also implant indelible impressions
of the nature and value of mathematics to a significant segment of the world. If
mathematics stands in the world like England did at the turn of the century, then 
we — you and I — are presiding over the Etons of Mathematics. If we lose contact 
with the essential ferment which is going on out there, then the world may simply 
walk away, not only from us but from mathematics. It is a challenge, a very real 
challenge, and it is one reason why I am proud to be associated with an organization, 
the MAA, which was created and exists exactly and exclusively to help us meet that 
challenge in the classroom. I hope that you will share my pride and that together we 
can insure the strength of mathematics for tomorrow by making it vital, meaningful, 
and (please forgive one last use of the overworked word) relevant for our students 
today. 


Based on a talk presented at the meeting of several Sections of the MAA during 1971-72.


References 


1. AMERICAN MATHEMATICAL MONTHLY, Volume 77, No. 6, June/July, 1970, p. 623. 

2. AMERICAN MATHEMATICAL MONTHLY, Volume 77, No. 5, May 1970, pp. 514—515. 

3. Internal C. U. P. M. document. 

4. AMERICAN MATHEMATICAL MONTHLY, Volume 77, No. 6, June/July, 1970, pp. 626-641.

5. SCIENCE, Volume 167, 9 Jan., 1970, p. 141. (Reprinted with permission of the publisher, The 
American Association for the Advancement of Science.) 

6. The Washington Post, Sunday, April 26, 1970, p. B5. (Reprinted with permission.)

7. Notices of the American Mathematical Society, Volume 19, No. 2, p. 119 © 1970. (Reprinted 
with permission of the American Mathematical Society.) 

8. Letter to A. B. Willcox from Peter Henrici. 

9. AMERICAN MATHEMATICAL MONTHLY, Volume 77, No. 7, June/July, 1970, pp. 641-646.

10. The Washington Post, Sunday, February 7, 1971, p. C1. (Reprinted with permission.)

11. Notices of the American Mathematical Society, Volume 18, No. 3, pp. 502-503, © 1971.
(Reprinted with the permission of the publisher, The American Mathematical Society.) 

12. AMERICAN MATHEMATICAL MONTHLY, Volume 78, No. 2, February 1971, pp. 130-142. 


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 
The present backlog for this Department is substantial. Until further notice, new manuscripts
cannot be accepted. This moratorium will probably continue until June 1, 1973; authors are
requested to hold their manuscripts pending a further announcement.


AN IDENTITY SATISFIED BY DERIVATIONS OF A PURELY INSEPARABLE FIELD 


F. P. CALLAHAN, Pennsylvania State University 


Introduction. Let $k$ be a field such that $\operatorname{char} k = p \neq 0$ and let $k(\alpha)$ be the purely
inseparable extension of $k$ obtained by adjoining to $k$ an $\alpha$ which satisfies the
irreducible equation $\alpha^p = a$, where $a$ is in $k$. Let $\partial$ be a $k$-linear derivation of $k(\alpha)$
(so that $\partial(cx) = c\,\partial(x)$ or, equivalently, $\partial(c) = 0$, for $c$ in $k$ and $x$ in $k(\alpha)$). Then
it is easy to see that if $\partial \neq 0$ there exists a unique $u$ in $k(\alpha)$ such that $\partial = (1/u)D$,
where $D$ (which may appropriately be called $d/d\alpha$) is the $k$-linear derivation of
$k(\alpha)$ for which $D(\alpha) = 1$. The result proven in the paper can now be stated:


THEOREM. Let $\partial$ ($\neq 0$) be a $k$-linear derivation of $k(\alpha)$ so that there exists a
unique $u$ in $k(\alpha)$ for which $\partial = (1/u)D$, where $D = d/d\alpha$. Then $\partial$ satisfies the
identity:

\[ \partial^p = b\,\partial, \quad \text{where } b = -D^{p-1}(u)/u^p. \]


Proof of Theorem. The general idea of the proof is to view $\partial$ as a $k$-linear
endomorphism of the underlying vector space of the algebra $k(\alpha)$.

As a first step in the proof we remark that $(1, x, x^2, \ldots, x^{p-1})$ is a basis for $k(\alpha)$
over $k$ if and only if $x$ is not in $k$. This is easily shown. Next we consider two cases:
in the first $\partial$ is nilpotent and in the second it is not.


Case 1. $\partial^p = 0$. In this case, by considering $\partial(\alpha), \partial^2(\alpha), \ldots, \partial^{p-1}(\alpha)$, we find
an $n$ such that $\partial^{n-1}(\alpha) \neq 0$ but $\partial^n(\alpha) = 0$. If $\partial^{n-1}(\alpha) = c$, then $c$ must be in $k$,
because otherwise the remark in the paragraph above shows that $\partial$ is identically
zero. Thus, if $y$ is defined to be $\partial^{n-2}(\alpha/c)$, then $\partial(y) = 1$ and $\partial$ is $d/dy$; also,
$\partial = (1/u)D$ where $u = Dy$. Since $D^p = 0$, as is easily seen, this gives $D^{p-1}u = 0$
so that the $b$ defined in the statement of the theorem is zero in this case and the
theorem is seen to be true in Case 1.


Case 2. $\partial^p \neq 0$. In this case it is evident that $\partial$ must have at least one non-zero
eigenvalue; let such an eigenvalue be $\lambda$ and if it is not already in the groundfield $k$
adjoin it to $k$ to obtain a larger groundfield $k'$ and algebra $k'(\alpha)$. The extension
of $\partial$ to a $k'$-linear derivation of $k'(\alpha)$ presents no problems, and we continue to
call it $\partial$. Now let $x$ be an eigenvector of $\partial$ such that $\partial(x) = \lambda x$. Since $\partial$ is a
derivation this implies that $\partial(x^i) = i x^{i-1}\partial(x) = i\lambda x^i$, so that $(1, x, x^2, \ldots, x^{p-1})$ is a
complete set of eigenvectors for $\partial$, the corresponding eigenvalues being
$(0, \lambda, \ldots, (p-1)\lambda)$. Since $i^p \equiv i \pmod p$, we see that $\partial^p(x^i) = b\,\partial(x^i)$ where $b = \lambda^{p-1}$.
Since $(1, x, \ldots, x^{p-1})$ is a basis this implies that $\partial^p = b\,\partial$. Also, since the eigenvalues
are distinct it implies that $\partial^p = b\,\partial$ is the characteristic equation of $\partial$ and that it
is the only equation of degree $p$ satisfied by all the eigenvalues of $\partial$.

To complete the proof it remains to show that $b = -(D^{p-1}(u))/u^p$. To do this,
again let $\lambda$ be an eigenvalue of $\partial$ and $x$ the corresponding eigenvector. Define the
$k'$-linear operator $T$ by the formula $T = D + \lambda M(u)$ where $M(u)$ is the operation
of multiplying by $u$ (so that $M(u)(y) = uy = yu$). Define $z_n$ by $z_n = T^n(1)$, where
“1” is the identity of the algebra. Now an easy inductive argument shows that
$D^n(x) = z_n x$.

Since $D^p = 0$ this implies that $z_p x = 0$; since $k'(\alpha)$ is a field and $x \neq 0$ this implies
that $z_p = 0$. That is, $T^p(1) = 0$.


Now the definition of $T$ implies that $T^p = T_p\lambda^p + T_{p-1}\lambda^{p-1} + \cdots + T_0$, where
$T_m$ is the sum of all monomial operators of the form $V_1 V_2 \cdots V_p$, each $V$ being
either $M(u)$ or $D$, and $M(u)$ occurring $m$ times and $D$ occurring $p - m$ times in
each such monomial term. Thus the equation $T^p(1) = 0$ becomes
$t_p\lambda^p + t_{p-1}\lambda^{p-1} + \cdots + t_0 = 0$, where $t_i = T_i(1)$.

Since $\lambda$ is any eigenvalue of $\partial$, we see that the equation above in $\lambda$ can only
differ from the characteristic equation, $\partial^p - b\,\partial = 0$, of $\partial$, by a multiplicative
constant. Thus, of the coefficients $t_0, t_1, \ldots, t_p$, all but $t_1$ and $t_p$ must vanish and we
must have $b = -t_1/t_p$.

Now $t_p$ and $t_1$ are easily computed as follows:

\[ t_p = T_p(1) = (M(u))^p(1) = u^p \]

and

\[ t_1 = \bigl(M(u)D^{p-1} + DM(u)D^{p-2} + \cdots + D^{p-1}M(u)\bigr)(1) = 0 + 0 + \cdots + D^{p-1}(u). \]

Thus, $b = -D^{p-1}(u)/u^p$ and the proof is complete.
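As a quick check of the coefficient computation, the smallest case $p = 2$ can be worked by hand. This is an illustrative special case added here, using only the definitions of $T$, $M(u)$ and $D$ given in the proof, together with $D(1) = 0$ and $D^2 = 0$:

```latex
\begin{aligned}
T^2 &= (D + \lambda M(u))^2
     = D^2 + \lambda\,\bigl(M(u)D + DM(u)\bigr) + \lambda^2 M(u)^2, \\
T^2(1) &= D^2(1) + \lambda\,\bigl(u\,D(1) + D(u)\bigr) + \lambda^2 u^2
        = \lambda\,D(u) + \lambda^2 u^2 .
\end{aligned}
```

Here $t_2 = u^2$, $t_1 = D(u)$ and $t_0 = 0$, so $b = -t_1/t_2 = -D(u)/u^2$, in agreement with the theorem for $p = 2$.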


REMARKS. (1) The vanishing of all the $t$'s except $t_1$ and $t_p$ provides some
unobvious identities. An alternative proof of these identities can be given by means
of the following identity which holds good in any commutative ring $R$ of
characteristic $p$:

Let $\partial$ be a derivation of $R$ and let $u$ be a member of $R$. Let operator $S$ be defined
by $S(y) = \partial(y) + uy$, where $uy$ is the product of $u$ and $y$ in $R$. Then
$S^p(y) = \partial^p(y) + u^p y + vy$ where $y$ is any element of $R$ and $v = \partial^{p-1}(u)$.

This identity can be obtained from one given by Jacobson (Lie Algebras,
Interscience Publishers, page 187, equation (63) with $a$ and $b$ replaced by $D$ and $\lambda M(u)$,
resp.).
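The identity in Remark (1) can be spot-checked numerically. The sketch below is not part of the note: it assumes the particular ring $R = \mathbf{F}_p[x]$ with the formal derivative $d/dx$ as the derivation, takes $p = 5$ as an arbitrary choice, and represents polynomials as coefficient lists. All function names are hypothetical helpers introduced for the illustration.

```python
P = 5  # characteristic; an illustrative choice, any prime should work


def trim(f):
    """Reduce coefficients mod P and drop trailing zeros (index i holds the x**i coefficient)."""
    f = [c % P for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])


def mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def deriv(f):
    """Formal derivative d/dx, a derivation of F_P[x]."""
    return trim([i * c for i, c in enumerate(f)][1:])


def iterate(op, n, y):
    """Apply the operator op to y exactly n times."""
    for _ in range(n):
        y = op(y)
    return y


def identity_holds(u, y):
    """Check S^P(y) == d^P(y) + u^P * y + v * y, with S(y) = d(y) + u*y and v = d^(P-1)(u)."""
    u, y = trim(u), trim(y)
    S = lambda w: add(deriv(w), mul(u, w))
    u_to_p = iterate(lambda w: mul(w, u), P, [1])
    v = iterate(deriv, P - 1, u)
    rhs = add(add(iterate(deriv, P, y), mul(u_to_p, y)), mul(v, y))
    return iterate(S, P, y) == rhs


# u = 1 + 2x + 3x^3 + x^4 is chosen so that v = d^4(u) is non-zero mod 5.
print(identity_holds([1, 2, 0, 3, 1], [4, 0, 1, 1, 2]))
```

A check of this kind only samples the identity on chosen inputs; it does not replace the derivation from Jacobson's formula.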

(2) The referee has kindly pointed out that the equation $\partial^p = b\,\partial$ but not the
explicit evaluation of $b$ can also be obtained by way of Jacobson’s Galois Theory
for purely inseparable fields of exponent one. (See N. Jacobson, Lectures in Abstract
Algebra, Van Nostrand, Princeton, N.J., Volume 3, page 190, equation 34.)


ON SUMS OF POWERS OF A NUMBER 
VLADIMIR DROBOT, State University of New York at Buffalo


In this note we investigate some approximation properties of polynomials whose
coefficients are 0, +1, or −1. The original problem, posed by M. Parnes [1],
can be formulated as follows: Suppose we take a walk along the x-axis, starting
from the origin, and at the time $n$ we are allowed to take a step of length $0$, $+t^n$,
or $-t^n$ ($t$ is fixed). Which points can we approach as closely as we wish? If $0 < t < 1$,
it is clear that we cannot get outside the interval $[-(1-t)^{-1}, +(1-t)^{-1}]$. Since
$\sum_{j>n} t^j \ge t^n$ if $\tfrac12 \le t < 1$, it is easy to prove that for such $t$ every point in the
above interval can be approximated as closely as we wish. If $t > 2$ and if $\varepsilon_j = 0$,
$+1$, or $-1$, $\varepsilon_N \neq 0$, then

\[ |\varepsilon_0 + \varepsilon_1 t + \cdots + \varepsilon_N t^N| \ge t^N - (1 + t + \cdots + t^{N-1}) > t^N\bigl(1 - (t-1)^{-1}\bigr), \]

which tends to $\infty$ as $N \to \infty$. From now on we shall restrict our attention to the
case $1 < t < 2$. Let $\mathcal{P}$ be the set of polynomials with coefficients 0, +1, or −1
and let $\mathcal{P}(t) = \{p(t) : p \in \mathcal{P}\}$. The above remarks show that if $0 < t < 1$ then
$\mathcal{P}(t) \subset [-(1-t)^{-1}, (1-t)^{-1}]$ and if $t \ge 2$ then $\mathcal{P}(t)$ is discrete. For $1 < t < 2$,
$\mathcal{P}(t)$ is dense in the line for all but (possibly) a countable number of $t$'s as the
following result shows.


THEOREM. If $1 < t < 2$ is not a root of any of the polynomials in $\mathcal{P}$ then $\mathcal{P}(t)$
is dense in the line.


Proof. First of all it is enough to prove that 0 is a cluster point of the set $\mathcal{P}(t)$.
Indeed assume that this is so. For any two positive integers $n$ and $k$ there can be
found an integer $m \ge n$ and a polynomial $p \in \mathcal{P}$ such that

\[ t^{-k-1} < t^m p(t) < t^{-k}. \tag{1} \]

This is done by first choosing $p \in \mathcal{P}$ so that $0 < p(t) < t^{-k-n}$, and then taking $m \ge n$
such that $t^{-k-m-1} < p(t) < t^{-k-m}$. It is clear that if $p$ belongs to $\mathcal{P}$ then so does
$\pm x^m p(x)$. We construct now a sequence of polynomials $\{p_N\} \subset \mathcal{P}$ no two of which
have terms in common and every one of which satisfies

\[ t^{-k-1} < p_N(t) < t^{-k}. \tag{2} \]


This is done as follows. Let $p_1 \in \mathcal{P}$ be any polynomial in $\mathcal{P}$ for which (2) holds.
Assume the polynomials $p_1, p_2, \ldots, p_N$ are already chosen, that they have no terms
in common, and they all satisfy (2). Let $n$ be an integer larger than the degree of any
of these polynomials and let $p_{N+1}(x) = x^m p(x)$, with $m \ge n$, be a polynomial in $\mathcal{P}$
satisfying (1). The sequence is hence defined inductively. It follows that
$p_1 + p_2 + \cdots + p_N \in \mathcal{P}$ and

\[ N t^{-k-1} < p_1(t) + \cdots + p_N(t) \le N t^{-k}. \]

Since $N$ and $k$ are arbitrary and $t > 1$, we see that any number can be approximated
as closely as we wish, by elements of $\mathcal{P}(t)$.

To prove that 0 is a cluster point of $\mathcal{P}(t)$ fix $\varepsilon > 0$, let $\mathcal{P}_n$ be the set of polynomials
in $\mathcal{P}$ of degree at most $n$ and let $\mathcal{P}_n^+$ be the set of polynomials in $\mathcal{P}_n$ with
non-negative coefficients. There are $2^{n+1}$ polynomials in $\mathcal{P}_n^+$ and if $p$ is one of
them then $p(t)$ lies between 0 and $1 + t + \cdots + t^n = (t^{n+1} - 1)(t - 1)^{-1}$. If $p \neq q$
are in $\mathcal{P}_n^+$ then $p - q$ is in $\mathcal{P}_n$ and $p(t) \neq q(t)$. Hence by the pigeon hole principle,
there are two polynomials $p_n \neq q_n$ in $\mathcal{P}_n^+$ such that

\[ 0 < |p_n(t) - q_n(t)| \le 2^{-n}(t^{n+1} - 1)(t - 1)^{-1}. \tag{3} \]


Since $1 < t < 2$, the right hand side of (3) is less than $\varepsilon$ for large $n$.
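The pigeon hole step can be watched at work on a small case. The sketch below is an illustration added here, not part of the note: it assumes the sample value $t = 3/2$ (which is not a root of any polynomial with coefficients 0, ±1, by the rational root theorem), uses exact rational arithmetic, and compares the smallest gap between values $p(t)$, $p$ with coefficients 0 or 1 and degree at most $n$, against the right hand side of (3). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def smallest_gap(t, n):
    """Smallest gap between two values p(t), where p has coefficients 0 or 1 and degree <= n."""
    powers = [t ** j for j in range(n + 1)]
    values = sorted(
        sum(c * pw for c, pw in zip(coeffs, powers))
        for coeffs in product((0, 1), repeat=n + 1)
    )
    return min(b - a for a, b in zip(values, values[1:]))


def pigeonhole_bound(t, n):
    """Right hand side of (3): 2^(-n) * (t^(n+1) - 1) / (t - 1)."""
    return (t ** (n + 1) - 1) / ((t - 1) * 2 ** n)


t = Fraction(3, 2)  # assumed sample value with 1 < t < 2
for n in (4, 8, 12):
    print(n, float(smallest_gap(t, n)), float(pigeonhole_bound(t, n)))
```

Since $t/2 < 1$, the bound behaves like a constant times $(t/2)^n$, which is why it eventually drops below any fixed $\varepsilon$.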

The set $\mathcal{P}(t)$ is not dense for every $t$ with $1 < t < 2$, however. This was discovered by
Prof. J. Isbell [2]. I am grateful to him for permission to include his example
here as well as for some helpful conversations on the subject.


EXAMPLE (J. Isbell). If $t = \tfrac12(\sqrt5 + 1)$ is the golden mean then $\mathcal{P}(t)$ is discrete.


Proof. The golden mean $t$ satisfies the equation $t^2 = t + 1$. We order all the
polynomials $p \in \mathcal{P}$ lexicographically according to the decreasing exponents of the
terms. More precisely, let

\[ p(x) = a_0 + a_1 x + \cdots + a_M x^M, \qquad q(x) = b_0 + b_1 x + \cdots + b_M x^M \]

be two polynomials in $\mathcal{P}$. Let $k$ be the largest integer for which $|a_k| \neq |b_k|$. We
say that $p(x)$ is of lower rank than $q(x)$ if $a_k = 0$ and $b_k \neq 0$. If $|a_j| = |b_j|$ for
all $j$, we arbitrarily say $p$ is of lower rank than $q$ if $p(2) < q(2)$. If $s \in \mathcal{P}(t)$ let $p_s(t)$
be the polynomial of lowest rank such that $s = p_s(t)$, let $h(s)$ be the degree of $p_s(t)$
and let $\alpha(s)$ be the last exponent before the first sign change. That is,

\[ s = \pm\,(t^h + \cdots + t^{\alpha} - t^{\beta} \pm \cdots). \]


We claim that $\beta = \alpha(s) - 1$. Moreover, all the terms with exponents less than $\alpha$
are present in the above representation and their coefficients alternate in sign. Indeed,
if $\beta < \alpha - 1$ then the terms $t^{\alpha} + 0\,t^{\alpha-1} - t^{\alpha-2}$ and $t^{\alpha} + 0\,t^{\alpha-1} + 0\,t^{\alpha-2}$ can be replaced
respectively by $t^{\alpha-1}$ and $t^{\alpha-1} + t^{\alpha-2}$, both of which are of lower lexicographical
order. Also if $\gamma > 1$ and $t^{\gamma}$ and $t^{\gamma-1}$ are present in $p_s(t)$ with opposite signs then
$t^{\gamma-2}$ must also be present with the same sign as $t^{\gamma}$. Otherwise the terms
$\pm(t^{\gamma} - t^{\gamma-1} + 0\,t^{\gamma-2})$ and $\pm(t^{\gamma} - t^{\gamma-1} - t^{\gamma-2})$ can be replaced by the terms $\pm t^{\gamma-2}$
and $0$, respectively, yielding a polynomial of lower rank. Finally then


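\[
\begin{aligned}
s = p_s(t) &= \pm\Bigl\{ t^h + \cdots + t^{\alpha} + \sum_{n=0}^{\alpha-1} (-1)^{\alpha-n} t^n \Bigr\} \\
           &= \pm\Bigl\{ t^h + \cdots + t^{\alpha} - \bigl(t^{\alpha} - (-1)^{\alpha}\bigr)(t+1)^{-1} \Bigr\}.
\end{aligned}
\]

It easily follows now that $|s| \to \infty$ as $h \to \infty$ and so $\mathcal{P}(t)$ is discrete.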
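A small numerical experiment is consistent with the example. The sketch below is an illustration added here and does not follow Isbell's rank argument: it enumerates every value $p(t)$ with coefficients 0, ±1 and degree at most $n$ at the golden mean, storing each value exactly as an integer pair $(a, b)$ standing for $a + bt$ (possible because $t^2 = t + 1$), and reports the smallest non-zero absolute value. The function names are hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden mean t


def values_up_to_degree(n):
    """All p(PHI), p with coefficients in {-1, 0, 1} and degree <= n, as pairs (a, b) = a + b*PHI."""
    reachable = {(0, 0)}
    power, next_power = (1, 0), (0, 1)  # PHI**0 = 1 and PHI**1 = PHI
    for _ in range(n + 1):
        pa, pb = power
        reachable = {
            (a + c * pa, b + c * pb) for (a, b) in reachable for c in (-1, 0, 1)
        }
        # PHI**(j+2) = PHI**(j+1) + PHI**j, so the pairs follow the Fibonacci recurrence.
        power, next_power = next_power, (
            power[0] + next_power[0],
            power[1] + next_power[1],
        )
    return reachable


def smallest_nonzero(n):
    """Smallest non-zero |p(PHI)| over polynomials of degree <= n."""
    return min(
        abs(a + b * PHI) for (a, b) in values_up_to_degree(n) if (a, b) != (0, 0)
    )


for n in (1, 4, 8, 12):
    print(n, smallest_nonzero(n))
```

In contrast with the dense case, the printed minimum does not shrink toward 0 as $n$ grows, which is what one expects if 0 is not a cluster point of $\mathcal{P}(t)$.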


References 


1. M. Parnes, Problem #15, Ridge Lea Problem Book, Mathematics Department, SUNY at
Buffalo. 


2. J. Isbell, Private communications. 


A LOCAL MEAN VALUE THEOREM FOR ANALYTIC FUNCTIONS 
ÅKE SAMUELSSON, University of Göteborg


The classical mean value theorem of differential calculus does not extend to the 
complex plane. The purpose of this note is to establish a local counterpart for 
analytic functions. 


THEOREM. If $f$ is analytic in a domain containing $z_0$ then there is a
neighborhood $N$ of $z_0$ such that if $z_1$ is any point in this neighborhood then there exists
a point $z$ with

\[ \bigl|z - \tfrac12(z_0 + z_1)\bigr| < \tfrac12 |z_1 - z_0| \]

such that $f(z_1) - f(z_0) = (z_1 - z_0) f'(z)$.


A slightly weaker version of this theorem has been proved by J. M. Robertson [1]. 
As a matter of fact, with the additional assumption that $f''(z_0) \neq 0$, Robertson's
proof yields our theorem. 


Proof. We may assume that f has the form

$f(z) = f(z_0) + (z - z_0) f'(z_0) + (z - z_0)^{k+1} h(z)$,

where $k \ge 1$ is an integer and $h(z_0) \neq 0$.
We may also assume, without loss of generality, that throughout the domain
of analyticity we have

$|h(z)| \ge \tfrac{1}{2}|h(z_0)|$ and $|h'(z)| \le 1$.

It suffices to show that if the neighborhood $N = \{z : |z - z_0| < r\}$ is chosen so
that $0 < r < |h(z_0)|/(2(k+2))$ and $z_1 \in N$, then the function

$f'(z) - \dfrac{f(z_1) - f(z_0)}{z_1 - z_0}$

has exactly one zero in the domain

$D = \{ z : |z - \tfrac{1}{2}(z_0 + z_1)| < \tfrac{1}{2}|z_1 - z_0|,\ |\arg\dfrac{z - z_0}{z_1 - z_0}| < \dfrac{\pi}{k} \}$.

A direct computation shows that

$f'(z) - \dfrac{f(z_1) - f(z_0)}{z_1 - z_0} = \Phi(z) + h(z_1)\Psi(z)$,

where $\Phi(z) = (z - z_0)^{k+1} h'(z) + (k+1)(z - z_0)^k (h(z) - h(z_1))$ and

$\Psi(z) = (k+1)(z - z_0)^k - (z_1 - z_0)^k$.




If $z \in \partial D$, the boundary of D, then

$|\Phi(z)| \le |z - z_0|^{k+1} |h'(z)| + (k+1)|z - z_0|^k \left| \int_{z_1}^{z} h'(w)\,dw \right| \le (k+2)|z_1 - z_0|^{k+1}$.

If z is on the circular arc of $\partial D$, i.e., if $z = \tfrac{1}{2}(z_0 + z_1) + \tfrac{1}{2}(z_1 - z_0)e^{2i\theta}$, $|\theta| \le \pi/k$,
then

$|\Psi(z)|^2 / |z_1 - z_0|^{2k} = 1 + (k+1)\big((k+1)\cos^k\theta - 2\cos k\theta\big)\cos^k\theta$.

Using the inequality

$(k+1)\cos^k\theta - 2\cos k\theta \ge 0$ for $|\theta| \le \pi/k$, $k = 1, 2, \dots$,

readily established by induction, we see that $|\Psi(z)| \ge |z_1 - z_0|^k$. If $k > 2$, then
the boundary $\partial D$ contains two line segments, namely $z = z_0 + t(z_1 - z_0)e^{\pm i\pi/k}$,
$0 \le t \le \cos(\pi/k)$. On these line segments we have

$|\Psi(z)| = (1 + (k+1)t^k)|z_1 - z_0|^k \ge |z_1 - z_0|^k$.

We have shown that $|\Psi(z)| \ge |z_1 - z_0|^k$ on $\partial D$. Hence, for $z_1 \in N$ and $z \in \partial D$,

$\dfrac{|\Phi(z)|}{|h(z_1)\Psi(z)|} \le \dfrac{(k+2)|z_1 - z_0|}{|h(z_1)|} < \dfrac{2(k+2)r}{|h(z_0)|} < 1$.

By Rouché's theorem we conclude that the functions $\Phi + h(z_1)\Psi$ and $\Psi$ have equally
many zeros in D, namely one. This proves our theorem.
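
The statement of the theorem can be checked numerically in a simple case. The sketch below is only an illustration, not part of the proof: it assumes the particular function f = exp, for which the equation f'(z) = (f(z_1) - f(z_0))/(z_1 - z_0) can be solved in closed form with a principal logarithm, and the sample points z0 and z1 are arbitrary choices.

```python
import cmath

def mean_value_point(z0, z1):
    """For f = exp, solve f'(z) = (f(z1) - f(z0)) / (z1 - z0) for z.

    Since f'(z) = exp(z), one solution is the principal logarithm of the
    difference quotient. This is valid when z0 and z1 are close together
    and Im(z0) is well inside (-pi, pi), as in the sample below.
    """
    quotient = (cmath.exp(z1) - cmath.exp(z0)) / (z1 - z0)
    return cmath.log(quotient), quotient

# Arbitrary sample points (assumptions made for illustration only).
z0 = 0.3 + 0.2j
z1 = z0 + (0.1 + 0.1j)

z, quotient = mean_value_point(z0, z1)
midpoint = (z0 + z1) / 2
radius = abs(z1 - z0) / 2

# The theorem predicts a solution inside the disc with diameter [z0, z1].
inside_disc = abs(z - midpoint) < radius
# Residual of the mean value equation f'(z) = difference quotient.
residual = abs(cmath.exp(z) - quotient)
print(inside_disc, residual)
```

For this choice the solution lies very close to the midpoint of z0 and z1, well inside the disc named in the theorem.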


Reference 


1. J. M. Robertson, A local mean value theorem for the complex plane, Proc. Edinburgh Math. 
Soc. (2) 16 (1968/69), 329-331. 


A THEOREM ON SET INCLUSION IN METRIC SPACES 
JAMES A. HEINEN, Marquette University, and ALBERT WILANSKY, Lehigh University 


Let A and B be subsets of a metric space (X,d). We shall show that under certain
(essentially sharp) conditions, A will be contained in B if $\partial A \subset B$. This result has
applications in the study of the stability properties of certain differential equations
and to the variation of the spectrum of a Banach algebra element.

For any set A in a metric space (X,d), let A' denote the complement of A,
C(A) the closure of A, and $\partial A$ the boundary of A.


THEOREM 1. Suppose A and B are relatively compact (i.e. C(A) and C(B) are
compact) subsets of a non-compact metric space (X,d) with B' connected. Then
the condition $\partial A \subset B$ implies $A \subset B$.




Proof. By a theorem of Hausdorff (a proof is given in [1], Theorem 1) we can
give X an equivalent unbounded metric $d_1$. A and B are still relatively compact,
and hence bounded, in $(X,d_1)$. Now assume that $\partial A \subset B$ and, for the purpose of
contradiction, that there exists a point $x \in A$ such that $x \notin B$, i.e. such that $x \in B'$.
Let $D_1 = C(A) \cap B'$ and $D_2 = C(A') \cap B'$. Clearly, $x \in D_1$, so that $D_1 \neq \emptyset$, the
null set. Since A and B are both bounded and $(X,d_1)$ is unbounded, it follows at
once that $D_2 \neq \emptyset$. Furthermore,

$D_1 \cup D_2 = [C(A) \cap B'] \cup [C(A') \cap B'] = B'$.

Now $C(D_1) = C[C(A) \cap B'] \subset C[C(A)] \cap C(B') = C(A) \cap C(B')$.
Hence

$C(D_1) \cap D_2 \subset C(A) \cap C(B') \cap D_2 = C(A) \cap C(B') \cap C(A') \cap B' = \partial A \cap B'$.

But since $\partial A \subset B$, $\partial A$ and B' have no points in common, thus implying that
$C(D_1) \cap D_2 \subset \partial A \cap B' = \emptyset$, and, in fact, that $C(D_1) \cap D_2 = \emptyset$. In a similar
fashion it may be shown that $D_1 \cap C(D_2) = \emptyset$. Thus $B' = D_1 \cup D_2$ where $D_1 \neq \emptyset$,
$D_2 \neq \emptyset$, and where $C(D_1) \cap D_2 = D_1 \cap C(D_2) = \emptyset$. That is to say, B' is the union
of two non-void separated subsets. This contradicts the assumption that B' is connected.
Hence there can exist no point $x \in A$ such that $x \notin B$, and thus $A \subset B$.

To show that each hypothesis of the theorem is required, consider the following
cases in which $\partial A \subset B$ and yet $A \not\subset B$ (in each case d is the usual Euclidean metric):

(1) A not relatively compact. $X = R^2$, $A = \{x \in X : d(x,0) \ge 1\}$, $B = \{x \in X : d(x,0) \le 2\}$.

(2) B not relatively compact. $X = R^2$, $A = \{x \in X : d(x,0) \le 2\}$, $B = \{x \in X : d(x,0) \ge 1\}$.

(3) B' not connected. $X = R^2$, $A = \{x \in X : d(x,0) \le 2\}$, $B = \{x \in X : 1 < d(x,0) \le 3\}$.

(4) (X,d) not non-compact. $X = \{x \in R^2 : d(x,0) \le 3\}$, $A = \{x \in X : d(x,0) \le 2\}$,
$B = \{x \in X : 1 \le d(x,0) \le 3\}$.
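
Case (3) can be spot-checked numerically. The following sketch is only an illustration under stated assumptions: the membership tests `in_A` and `in_B` are hypothetical helper names, the boundary of A is taken to be the circle of radius 2 in the plane, and only finitely many sample points of it are tested.

```python
import math

def in_A(p):
    # A = {x in R^2 : d(x, 0) <= 2}
    return math.hypot(p[0], p[1]) <= 2

def in_B(p):
    # B = {x in R^2 : 1 < d(x, 0) <= 3}
    return 1 < math.hypot(p[0], p[1]) <= 3

# Sample points on the boundary of A, the circle of radius 2.
boundary_A = [
    (2 * math.cos(2 * math.pi * k / 360), 2 * math.sin(2 * math.pi * k / 360))
    for k in range(360)
]
boundary_in_B = all(in_B(p) for p in boundary_A)

# The origin lies in A but not in B, so A is not contained in B.
origin = (0.0, 0.0)
A_not_in_B = in_A(origin) and not in_B(origin)

# The complement of B has points in the inner disc and in the outer region,
# which are separated by the annulus B.
inner, outer = (0.5, 0.0), (4.0, 0.0)
complement_has_two_parts = not in_B(inner) and not in_B(outer)
print(boundary_in_B, A_not_in_B, complement_has_two_parts)
```

The sampled boundary lies in B while the origin does not, which is consistent with the failure of the conclusion once B' is allowed to be disconnected.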

As indicated earlier, this result has applications in the study of the behavior
of solutions of differential equations [2]. Consider the n-dimensional vector
differential equation

(1) $\dot{x} = f(x, t)$,

where it is assumed that f is sufficiently smooth to guarantee unique solutions which
depend continuously on initial data. Let $x(t; x_0, t_0)$ denote the (unique) solution
of (1) satisfying $x(t_0; x_0, t_0) = x_0$. Under these conditions, $x(t; \cdot, t_0)$ is a
homeomorphism from $R^n$ to $R^n$. Since set boundaries and compactness are preserved under
homeomorphisms, it can readily be shown, using Theorem 1, that if $S_0$ is a compact
subset of $R^n$ and S is a bounded subset of $R^n$ with S' connected, then the
condition $x(t; \partial S_0, t_0) \subset S$ implies the condition $x(t; S_0, t_0) \subset S$. This, of course, allows
one to arrive at conclusions regarding the nature of solutions of equation (1) for
all $x_0 \in S_0$ by simply verifying these conditions for all $x_0 \in \partial S_0$. As might be expected,
this is of great interest when studying the stability of solutions of equation (1).
Theorem 1 also leads to results in the study of Banach algebras. Let A be a
Banach algebra with identity 1 and S a closed subalgebra with $1 \in S$. Then
$\sigma(a,S) = \{z \in C : a - z1$ has no inverse in $S\}$ is, for each $a \in A$, a compact subset of the
complex plane C ([3], p. 261, Theorem 3). Let $\rho(a,S)$ be the complement of $\sigma(a,S)$.


THEOREM 2. If $\rho(a,A)$ is connected, $\sigma(a,S) = \sigma(a,A)$; thus $\sigma(a,S)$ is independent
of S.


Proof. "$\supset$" is trivial. To prove "$\subset$", we note ([3], p. 266, Problem 23) that
any boundary point of $\sigma(a,S)$ is in $\sigma(a,A)$ so Theorem 1 applies.


COROLLARY. Suppose that for a certain $a \in A$ there exists $S_0$ such that $\sigma(a,S_0)$
is real (or more generally, is nowhere dense and has connected complement); then
$\sigma(a,S)$ is independent of S.


References 


1. V. L. Klee, Jr., Some characterizations of compactness, this MONTHLY, 58 (1951) 389-393. 
2. J. A. Heinen, Set Stability of Dynamical Systems, Ph. D. Dissertation, Marquette University, 
Milwaukee, Wisconsin, 1969. 


3. A. Wilansky, Functional Analysis, Ginn-Blaisdell, Waltham, Mass., 1964. 


CIRCLE GROUPS OF NILPOTENT RINGS 


J. C. AULT AND J. F. WATTERS, The University of Leicester, England 


A radical ring R is equal to its Jacobson radical and is therefore a group under
the $\circ$-operation given by

$a \circ b = a + b + a \times b$,

where + and $\times$ denote the addition and multiplication in R. The group thus formed
is called the circle group (Kruse, [3]). If R is a nilpotent ring of index n, say, that
is $R^n = 0$ (any product of n elements is zero) but $R^{n-1} \neq 0$, then R is a radical ring
and its circle group is a nilpotent group of class at most n - 1, as we show in the
remark below. It is of interest to know which nilpotent groups arise as the circle
groups of nilpotent rings. Kruse [3] has given necessary conditions for a finite
nilpotent group to be a circle group and from these it can be deduced that not every
nilpotent group of class 3 is a circle group. On the other hand, every Abelian group
(nilpotent of class 1) is a circle group (of a zero ring, that is nilpotent of index 2, in
fact, but also in many cases as circle groups of nilpotent rings of index greater than
2). The purpose of the present note is to consider the case of nilpotent groups of class
2. Kaloujnine [2] has already established that a large class of such groups are circle
groups, but our method is more general and deals with all finite groups as well as
some infinite groups.




REMARK. To show that a nilpotent ring R of index n has a circle group which is
nilpotent of class at most n - 1, we consider the chain

(1) $R \supset R^2 \supset \cdots \supset R^{n-1} \supset R^n = 0$,

as a series of subgroups of the circle group of R.
If $x \in R^k$, where $1 \le k < n$, $y \in R$ and $y' \in R$ is such that $y' \circ y = 0$, then

$y' \circ x \circ y = x + y' \times x + x \times y + y' \times x \times y$

belongs to $R^k$. Hence $R^k$ is a normal subgroup of the circle group of R. Furthermore,
if $x' \in R$ is such that $x' \circ x = 0$, then

$x' \circ y' \circ x \circ y \in R^{k+1}$

so that (1) is a central series in the circle group of R, which is therefore a nilpotent
group of class at most n - 1.
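
As a concrete illustration of the remark, the following sketch builds a small nilpotent ring of index 3 and checks by brute force that its circle group is nilpotent of class 2. The choice of ring is an assumption made for the example: strictly upper triangular 3 x 3 matrices over the integers modulo P, stored as triples (x, y, z) for the entries in positions (1,2), (2,3) and (1,3). The helper names are hypothetical.

```python
from itertools import product

P = 2  # illustrative modulus (an assumption; any integer >= 2 works here)

def add(a, b):
    # Componentwise addition of the matrix entries modulo P.
    return tuple((s + t) % P for s, t in zip(a, b))

def mul(a, b):
    # Product of strictly upper triangular matrices: only the (1,3) entry
    # survives, and it equals (entry (1,2) of a) * (entry (2,3) of b).
    return (0, 0, (a[0] * b[1]) % P)

def circle(a, b):
    # The circle operation a o b = a + b + a x b.
    return add(add(a, b), mul(a, b))

R = list(product(range(P), repeat=3))
ZERO = (0, 0, 0)

def circle_inverse(a):
    # Brute-force search for the two-sided inverse under the circle operation.
    return next(b for b in R if circle(a, b) == ZERO and circle(b, a) == ZERO)

def commutator(a, b):
    # [a, b] = a^{-1} b^{-1} a b, computed in the circle group.
    return circle(circle(circle_inverse(a), circle_inverse(b)), circle(a, b))

cube_zero = all(mul(mul(a, b), c) == ZERO for a in R for b in R for c in R)
square_nonzero = any(mul(a, b) != ZERO for a in R for b in R)
associative = all(
    circle(circle(a, b), c) == circle(a, circle(b, c))
    for a in R for b in R for c in R
)
nonabelian = any(circle(a, b) != circle(b, a) for a in R for b in R)
central_commutators = all(
    circle(commutator(a, b), c) == circle(c, commutator(a, b))
    for a in R for b in R for c in R
)
print(cube_zero, square_nonzero, associative, nonabelian, central_commutators)
```

Here the ring has index 3, and the checks confirm that the circle group is non-Abelian with all commutators central, that is, nilpotent of class 2 = n - 1.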


Let G be a nilpotent group of class 2, that is the centre Z of G is such that the
factor group G/Z is Abelian. We recall here that, in such a group, we have the
commutator identities

$[ab, c] = [a, c][b, c]$

and

$[a, bc] = [a, b][a, c]$,

where $[a,b] = a^{-1}b^{-1}ab$ and a, b and c are elements of G. These identities will be
used in the subsequent calculations without further reference.


We begin by establishing a necessary and sufficient condition for G to be the
circle group of a nilpotent ring of index 3.

THEOREM 1. The nilpotent group G of class 2 is the circle group of a nilpotent ring
of index 3 if and only if there is a mapping m from the Cartesian product G x G
into Z such that for all g, h and k in G,

(i) $m(gh, k) = m(g,k)m(h,k)$,
(ii) $m(g, hk) = m(g,h)m(g,k)$,
(iii) $m(m(g,h), k) = m(g, m(h,k)) = e$,
where e denotes the identity element in G, and
(iv) $m(g,h)\{m(h,g)\}^{-1} = [g,h]$.


Proof. If G is the circle group of a ring R, with $R^3 = 0$, then it is not difficult
to verify that the mapping m given by

$m(g,h) = g \times h$

takes its values in the centre of G and satisfies the conditions (i) to (iv).
Conversely, if G is a nilpotent group of class 2 and there is a mapping m satisfying
the conditions (i) to (iv), then we define

$g + h = hg\,m(g,h)$ and $g \times h = m(g,h)$




for all g and h in G. It is straightforward to verify that these definitions make G into 
a ring R which is nilpotent of index 3. The details are left to the reader. 
We mention that condition (iii) implies that 


m(g,e) =e = m(e,g) 
for all g in G. It then follows that the element e of G is the zero element of the ring. 
The negative of an element g of G may be expressed in the form 
$-g = g^{-1}\{m(g, g^{-1})\}^{-1}$.


REMARK. Condition (iv) implies that $R^2 \neq 0$. It is worth noting that, in the
verification of the ring axioms, condition (iv) is needed only for the commutativity 
of the addition, but it is the one which is the most significant. There are many 
mappings which satisfy (i), (ii) and (iii), (for example, m(g, h) = e or m(g,h)=[g,h]), 
but it is more difficult to find ones which also satisfy (iv). In the next theorem we show 
how to construct such a mapping m when G is finite. 


THEOREM 2. If G is a finite nilpotent group of class 2, then G is the circle group of
a nilpotent ring of index 3. 


Proof. Since G is finite, the Abelian group G/Z is isomorphic to a direct product
of a finite number of finite cyclic groups. Thus we can choose a set of independent
generators $Za_1, Za_2, \dots, Za_r$ for G/Z having orders $n_1, n_2, \dots, n_r$, say. Then, given
any g in G, there are unique exponents $\alpha_1(g), \alpha_2(g), \dots, \alpha_r(g)$ such that

$Zg = (Za_1)^{\alpha_1(g)} (Za_2)^{\alpha_2(g)} \cdots (Za_r)^{\alpha_r(g)} = Z a_1^{\alpha_1(g)} a_2^{\alpha_2(g)} \cdots a_r^{\alpha_r(g)}$

and $0 \le \alpha_i(g) < n_i$ for $i = 1, 2, \dots, r$. Put

(2) $m(g,h) = \prod_{i<j} [a_i, a_j]^{\alpha_i(g)\alpha_j(h)}$,

where $i, j = 1, 2, \dots, r$. Since $[a_i, a_j]$ is in Z and the exponents are uniquely
determined, this does define a mapping from G x G into Z. It remains to check conditions
(i) to (iv) of Theorem 1.

(i) It is necessary to calculate the exponent $\alpha_i(gh)$. Now

$Zgh = (Zg)(Zh) = Z a_1^{\alpha_1(g)+\alpha_1(h)} a_2^{\alpha_2(g)+\alpha_2(h)} \cdots a_r^{\alpha_r(g)+\alpha_r(h)}$

so that

(3) $\alpha_i(gh) \equiv \alpha_i(g) + \alpha_i(h) \pmod{n_i}$.

Since $[a_i, a_j]^{n_i} = [a_i^{n_i}, a_j] = e$ we have

$[a_i, a_j]^{\alpha_i(gh)\alpha_j(k)} = [a_i, a_j]^{\alpha_i(g)\alpha_j(k)} [a_i, a_j]^{\alpha_i(h)\alpha_j(k)}$.

Therefore

$m(gh, k) = \prod_{i<j} [a_i, a_j]^{\alpha_i(gh)\alpha_j(k)}
 = \prod_{i<j} [a_i, a_j]^{\alpha_i(g)\alpha_j(k)} \prod_{i<j} [a_i, a_j]^{\alpha_i(h)\alpha_j(k)}
 = m(g,k)m(h,k)$.


(ii) This condition follows in the same way as (i).
(iii) If z is in Z, then $\alpha_i(z) = 0$ for all $i = 1, 2, \dots, r$.
Hence

$m(g, z) = m(z, k) = e$

for all z in Z and g and k in G. Condition (iii) follows since m takes values in Z.
(iv) Suppose

$g = z_1 a_1^{\alpha_1(g)} a_2^{\alpha_2(g)} \cdots a_r^{\alpha_r(g)}$

and

$h = z_2 a_1^{\alpha_1(h)} a_2^{\alpha_2(h)} \cdots a_r^{\alpha_r(h)}$,

where $z_1$ and $z_2$ are in Z. Then

$[g,h] = \prod_{i,j=1}^{r} [a_i, a_j]^{\alpha_i(g)\alpha_j(h)}
 = \prod_{i<j} [a_i, a_j]^{\alpha_i(g)\alpha_j(h) - \alpha_i(h)\alpha_j(g)}$,

where $i, j = 1, 2, \dots, r$. Thus

$m(g,h)\{m(h,g)\}^{-1} = \prod_{i<j} [a_i, a_j]^{\alpha_i(g)\alpha_j(h)} \prod_{i<j} [a_i, a_j]^{-\alpha_i(h)\alpha_j(g)} = [g,h]$.

This completes the proof of the theorem.
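
The construction (2) can be tested on a small example. The sketch below is an illustration under stated assumptions rather than part of the proof: G is taken to be the Heisenberg group over the integers modulo N, written as triples (a, b, c) with product (a, b, c)(a', b', c') = (a + a', b + b', c + c' + ab'), so that G/Z is generated by the cosets of a_1 = (1, 0, 0) and a_2 = (0, 1, 0) and the exponents are alpha_1(g) = a, alpha_2(g) = b. All function names are hypothetical.

```python
from itertools import product

N = 3  # illustrative modulus (an assumption)

def mul(g, h):
    a, b, c = g
    a2, b2, c2 = h
    return ((a + a2) % N, (b + b2) % N, (c + c2 + a * b2) % N)

def inv(g):
    a, b, c = g
    return ((-a) % N, (-b) % N, (a * b - c) % N)

def comm(g, h):
    # [g, h] = g^{-1} h^{-1} g h
    return mul(mul(inv(g), inv(h)), mul(g, h))

def power(g, n):
    result = E
    for _ in range(n):
        result = mul(result, g)
    return result

E = (0, 0, 0)
A1, A2 = (1, 0, 0), (0, 1, 0)
C12 = comm(A1, A2)  # the single commutator [a_1, a_2] needed when r = 2

def m(g, h):
    # Formula (2) with r = 2: [a_1, a_2]^(alpha_1(g) * alpha_2(h)).
    return power(C12, g[0] * h[1])

G = list(product(range(N), repeat=3))
cond_i = all(m(mul(g, h), k) == mul(m(g, k), m(h, k)) for g in G for h in G for k in G)
cond_ii = all(m(g, mul(h, k)) == mul(m(g, h), m(g, k)) for g in G for h in G for k in G)
cond_iii = all(m(m(g, h), k) == E == m(g, m(h, k)) for g in G for h in G for k in G)
cond_iv = all(mul(m(g, h), inv(m(h, g))) == comm(g, h) for g in G for h in G)
print(cond_i, cond_ii, cond_iii, cond_iv)
```

In this example all four conditions of Theorem 1 hold for every choice of g, h and k.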
Can every infinite nilpotent group of class 2 occur as the circle group of a nilpotent
ring? It is our conjecture that every nilpotent group G (centre Z) of class 2
is the circle group of a nilpotent ring of index 3, but have not yet been able to prove
this in general. However, we can show that this conjecture is true in the following
cases.


Case 1. The group G/Z is a direct product of cyclic groups.

The formal definition of the function m is exactly the same as in (2). In the
case when G/Z has an infinite factor with generator $Za_i$, say, then $\alpha_i(g)$ can be
any integer and the congruence (3) becomes

$\alpha_i(gh) = \alpha_i(g) + \alpha_i(h)$.

In the case when G/Z has infinitely many factors, these factors have to be indexed
by a well-ordered index set. Since, for a given element g of G, only finitely many of
the exponents $\alpha_i(g)$ will be non-zero, there will only be finitely many non-identity
factors in the right-hand side of (2) and so m is well-defined.


Case 2. The group G/Z is a torsion group. 

This case is more difficult but may be reduced to the previous one by first de- 
composing G/Z into its p-components and then considering, in each of these compo- 
nents, a basic subgroup, which by definition is a direct product of cyclic groups 
(Fuchs [1, p. 98]). 


Case 3. Every element of Z has a unique square root. 

Here we set $m(g,h) = [g,h]^{1/2}$ and it is not difficult to verify that this satisfies
conditions (i) to (iv). The ring so obtained is essentially the same as the one discussed
by Kaloujnine [2].

This case includes the case when Z has odd exponent. 

Whether the conjecture is true in general remains an open question. 


References 


1. L. Fuchs, Abelian Groups, Pergamon, London, 1960. 

2. L. Kaloujnine, Zum Problem der Klassifikation der endlichen metabelschen p-Gruppen, 
Wiss. Z. Humboldt-Univ. Berlin, Math. —Nat. Reihe, 4 (1955) 1-7. 

3. R. L. Kruse, On the circle group of a nilpotent ring, this MONTHLY, 77 (1970) 168-170. 


RESEARCH PROBLEMS 


EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied 
by relevant references (if any are known to the author) and by a brief description of known 
partial results. Manuscripts should be sent to Richard Guy, Department of Mathematics, 
Statistics, and Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


CROSSING NUMBER PROBLEMS 


P. ERDŐS, Hungarian Academy of Science, and R. K. GUY, University of Calgary


A graph, G(V,E), is a set V of vertices and a subset E of the unordered pairs
of vertices, called edges. A drawing is a mapping of a graph into a surface. The
vertices go into distinct points, nodes. An edge and its incident vertices map into a
homeomorphic image of the closed interval [0,1] with the relevant nodes as endpoints
and the interior, an arc, containing no node. A good drawing is one in which
no two arcs incident with a common node have a common point; and no two arcs
have more than one point in common. A common point of two arcs is a crossing.
An optimal drawing in a given surface is one which exhibits the least possible number
of crossings. Optimal drawings are good. This least number is the crossing number
of the graph for the surface. We denote the crossing number of G for the plane
(or sphere) by $\nu(G)$.

Almost all questions that one can ask about crossing numbers remain unsolved.
For the complete graph, $K_n$, with n vertices and all $\binom{n}{2}$ possible edges, it has been
conjectured [7] that

(1) $\nu(K_n) = \tfrac{1}{4}[\tfrac{1}{2}n][\tfrac{1}{2}(n-1)][\tfrac{1}{2}(n-2)][\tfrac{1}{2}(n-3)]$,

where brackets denote greatest integer not greater than. For $n \le 10$, this has been
verified [10]:

    n          2   3   4   5   6   7   8   9   10
    ν(K_n)     0   0   0   1   3   9   18  36  60
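
The conjectured formula (1) is easy to evaluate and compare with the tabulated values. The sketch below is only a check of the arithmetic; the function name `conjectured_crossing_number` is a hypothetical label, and the brackets are read as the floor function.

```python
def conjectured_crossing_number(n):
    # Formula (1): (1/4) [n/2] [(n-1)/2] [(n-2)/2] [(n-3)/2], brackets = floor.
    # The product of the four floors is always divisible by 4.
    return (n // 2) * ((n - 1) // 2) * ((n - 2) // 2) * ((n - 3) // 2) // 4

# Values quoted in the table for n = 2, ..., 10.
TABLE = {2: 0, 3: 0, 4: 0, 5: 1, 6: 3, 7: 9, 8: 18, 9: 36, 10: 60}

agrees_with_table = all(conjectured_crossing_number(n) == v for n, v in TABLE.items())
print(agrees_with_table)
```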


Blažek and Koman [1] and others [e.g., 7, 12] have given constructions which show
that (1) is an upper bound. Kleitman's result [15, and see below] for the complete
bipartite graph implies that for n sufficiently large,

(2) $\nu(K_n) \ge \tfrac{1}{80} n(n-1)(n-2)(n-3)$.


This is a little better than the lower bound given in [9]. It is easy to see that $\nu(K_n)/n^4$
is non-decreasing and so tends to a limit (between $\tfrac{1}{80}$ and $\tfrac{1}{64}$). A counting argument
shows that if (1) is true for n odd, then it is also true for n + 1. Eggleton and
Guy [3] have also shown that for n odd, $\nu(K_n)$ and $\binom{n}{4}$ have the same parity. Call
two drawings isomorphic when there is a one-to-one correspondence between the
nodes so that if any pair of arcs crosses, the corresponding pair also crosses. Optimal
drawings of $K_n$ for n = 5, 6, 7, 8 are shown in Figures 1, 2, 3, 4. For n = 5, 6 these
are unique, but for n = 7 there are five which are non-isomorphic and for n = 8
there are three [10]. For n = 9 the number is about 200.

FIG. 1

FIG. 3

An attempt to put the theory of crossing numbers into algebraic form has been 
made by Tutte [20]. 


FIG. 4


If the arcs are restricted to be straight line-segments, we have the concept of
rectilinear crossing number, $\bar\nu(G)$, of a graph G. It is clear that $\bar\nu(G) \ge \nu(G)$. A
theorem of Fáry [6, 19] may be stated: if a graph can be embedded in the plane,
then it can be so drawn using straight line segments. Hence $\nu(G) = 0$ implies $\bar\nu(G) = 0$.
For $n \le 7$ and n = 9, $\bar\nu(K_n) = \nu(K_n)$. (Figure 3 can be realized with straight line
segments.) But Guy [10] has confirmed a conjecture of Harary and Hill [13] that
$\bar\nu(K_8) = 19$, in contrast to $\nu(K_8) = 18$. It can also be shown that $\bar\nu(K_n) > \nu(K_n)$
for $n \ge 10$. It is conjectured that $\bar\nu(K_{10}) = 63$. Jensen [14] and independently
Eggleton have shown that

(3) $\bar\nu(K_n) \le [(7n^4 - 56n^3 + 128n^2 + 48n[(n-7)/3] + 108)/432]$

and equality is conjectured. The fact that $\bar\nu(K_5) = 1$ gives an immediate proof of
Esther Klein's result [5] that five points in the plane always include a convex
quadrilateral. More generally, there is an exact correspondence between rectilinear
crossings and convex quadrilaterals, so the problem of determining the rectilinear
crossing number for the complete graph can be restated in the form: what is the
least number of convex quadrilaterals determined by n points in the plane? More
generally, one can ask for the least number of convex k-gons determined by n points
in the plane, for k > 4. As before, the ratio of this number to $\binom{n}{k}$ tends to a
positive limit as n tends to infinity with k fixed.
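
The upper bound (3) can be evaluated directly and compared with the values of the rectilinear crossing number quoted in the text. This is a sketch of the arithmetic only; the function name `jensen_bound` is hypothetical, and it is an assumption that both pairs of brackets in (3) denote the floor function, including for the negative arguments that occur when n < 7.

```python
def jensen_bound(n):
    # Formula (3), reading both pairs of brackets as floor.
    # Python's // rounds toward minus infinity, matching that reading.
    inner = (n - 7) // 3
    return (7 * n**4 - 56 * n**3 + 128 * n**2 + 48 * n * inner + 108) // 432

values = {n: jensen_bound(n) for n in range(5, 11)}
print(values)
```

Under that reading the bound reproduces the figures mentioned above: 1 for n = 5, 19 for n = 8 and the conjectured 63 for n = 10.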

The crossing number problem for the complete bipartite graph, $K_{m,n}$, on m + n
vertices, whose mn edges are just those which join one of the m vertices to one of the n,
first appeared as Turán's brick-factory problem. For some years it was thought
that Zarankiewicz [22] and Urbanik [21] had solved this, but a hiatus in the proof
was found independently by Ringel and Kainen [see 8] and the formula

(4) $\nu(K_{m,n}) = [\tfrac{1}{2}m][\tfrac{1}{2}(m-1)][\tfrac{1}{2}n][\tfrac{1}{2}(n-1)]$

is still conjectural. It was established for $\min(m,n) \le 3$ by Zarankiewicz and a
counting argument again gives the result for each even number if it is known for
the preceding odd one. The best result is due to Kleitman [15] who established (4)
for $\min(m,n) \le 6$. The corresponding rectilinear problem may have the same solution
(4), since Zarankiewicz's construction uses only straight arcs (Figure 5).
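
Formula (4) can be evaluated in the same way. The sketch below only computes the conjectured value; the function name `zarankiewicz_number` is a hypothetical label and the brackets are read as floor.

```python
def zarankiewicz_number(m, n):
    # Formula (4): [m/2] [(m-1)/2] [n/2] [(n-1)/2], brackets = floor.
    return (m // 2) * ((m - 1) // 2) * (n // 2) * ((n - 1) // 2)

k33 = zarankiewicz_number(3, 3)
k77 = zarankiewicz_number(7, 7)
symmetric = all(
    zarankiewicz_number(m, n) == zarankiewicz_number(n, m)
    for m in range(1, 12) for n in range(1, 12)
)
print(k33, k77, symmetric)
```

For m = n = 7 this gives 81, the value attached to Figure 5.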

For the 1-skeleton of the n-cube, $Q_n$, whose vertices, the $2^n$ binary n-tuples,
are joined by an edge just if their vectors differ in exactly one component, Eggleton
and Guy [4] announced that

(5) $\nu(Q_n) \le \tfrac{5}{32} 4^n - [\tfrac{1}{2}(n^2+1)]\, 2^{n-2}$,

but a gap has been found in the description of the construction, so this must also
remain a conjecture. We again conjecture equality in (5).

FIG. 5. $\nu(K_{7,7}) = \bar\nu(K_{7,7}) = 81$.


More generally, let G(n,k) be a graph with n vertices and k edges. Denote by
g(n,k) the minimum of $\nu(G)$ taken over all graphs G(n,k). Then we conjecture that

(6) $c_1 k^3/n^2 < g(n,k) < c_2 k^3/n^2$;

in fact, that if $k/n \to \infty$, then $\lim g(n,k)/(k^3/n^2)$ exists. From Euler's theorem,
g(n, 3n-6) = 0, g(n, 3n-5) = 1. The upper bound in (6) is trivial (with $c_2 = 1/8$),
for, let l be the least integer with $ln \ge 2k$ and consider n/l copies of $K_l$. The lower
bound would follow if we could prove that every drawing of a G(n,k) contains an arc
with at least $c_3 k^2/n^2$ crossings. In this connexion we can ask the following question:
determine or estimate the smallest integer f(r) so that every drawing of a graph
G(n, f(r)) contains an arc with at least r crossings. Euler's theorem implies that
f(1) = 3n - 5 and Eggleton and Guy [3] have shown that f(2) = 4n - 8 for n = 6, 7
and 9, and 4n - 7 for n = 8 or $n \ge 10$. This implies that

$g(n,k) = k - 3n + 6$ for $3n - 6 \le k \le \min(4n - 8, \binom{n}{2})$,

except that g(7,20) = 6 and g(9,28) = 8. But f(3) has not yet been determined.
Another related question is: which graphs G(n,k) have maximal $\nu(G)$ and what
is this maximum? We conjecture that the following graph has maximal $\nu(G)$: take l
so that

$\binom{l}{2} \le k < \binom{l+1}{2}$

and the graph consists of $K_l$ with a vertex joined to $k - \binom{l}{2}$ of its vertices (and n - l - 1
isolated points).


These more general problems can also be posed in the rectilinear case. We can 
also ask analogous questions for surfaces of higher genus; some results have been 
obtained for the torus [11, 12], and for the projective plane and Klein bottle [16]. 

We are indebted to R. B. Eggleton for helpful discussions and suggestions, and 
permission to reproduce his results. 


References 


1. J. Blažek and M. Koman, A minimal problem concerning complete plane graphs, in M.
Fiedler (ed.), Theory of Graphs and its Applications, Proc. Symp. Smolenice, 1963; Prague, 1964,
113-117; MR 30(1965) #44249.

2. J. Blažek and M. Koman, On an extremal problem concerning graphs, Comm. Math. Univ.
Carolinae, 8(1967) 49-52; MR 35(1968) #41506.

3. R. B. Eggleton, Ph.D. thesis, Univ. of Calgary, 1973.

4. R. B. Eggleton and R. K. Guy, The crossing number of the n-cube, Amer. Math. Soc. Notices,
17(1970) 757.

5. P. Erdős and G. Szekeres, A combinatorial problem in geometry, Compositio Math., 2
(1935) 463-470.

6. I. Fáry, On straight line representation of planar graphs, Acta Sci. Math. (Szeged), 11 (1948)
229-233; MR 10 (1949) 136.

7. R. K. Guy, A combinatorial problem, Nabla (Bull. Malayan Math. Soc.) 7 (1960) 68-72.

8. R. K. Guy, The decline and fall of Zarankiewicz's theorem, in F. Harary (ed.), Proof Techniques
in Graph Theory, Academic Press, N. Y., 1969, 63-69.

9. R. K. Guy, Sequences associated with a problem of Turán and other problems, Proc. Balatonfüred
Combinatorics Conf., 1969. Bolyai János Matematikai Társulat, Budapest, 1970, 553-569.

10. R. K. Guy, Latest results on crossing numbers, in Recent Trends in Graph Theory, Springer,
N.Y., 1971, 143-156.

11. R. K. Guy and T. A. Jenkyns, The toroidal crossing number of $K_{m,n}$, J. Combinatorial
Theory, 6 (1969) 235-250; MR 38(1969) #45660.

12. R. K. Guy, T. A. Jenkyns and J. Schaer, The toroidal crossing number of the complete
graph, J. Combinatorial Theory, 4(1968) 376-390, MR 36 (1968) 43682.

13. F. Harary and A. Hill, On the number of crossings in a complete graph, Proc. Edinburgh
Math. Soc. (2), 13 (1962-3) 333-338; MR 29 (1965) #602.

14. H. F. Jensen, An upper bound for the rectilinear crossing number of the complete graph, J.
Combinatorial Theory, 10B (1971) 212-216.

15. D. J. Kleitman, The crossing number of $K_{5,n}$, J. Combinatorial Theory, 9 (1970) 315-323.

16. M. Koman, On the crossing numbers of graphs, Acta Univ. Carolinae Math. Phys., 10
(1969) 9-46.

17. K. Kuratowski, Sur le problème des courbes gauches en topologie, Fund. Math., 15 (1930)
271-283.

18. J. W. Moon, On the distribution of crossings in random complete graphs, J. Soc. Indust.
App. Math., 13 (1965) 506-510; MR 31 (1966) #3357.

19. W. T. Tutte, How to draw a graph, Proc. London Math. Soc. (3), 13 (1963) 743-767; MR
28 (1964) #1610.

20. W. T. Tutte, Towards a theory of crossing numbers, J. Combinatorial Theory, 8 (1970) 45-53.

21. K. Urbanik, Solution du problème posé par P. Turán, Colloq. Math., 3 (1955) 200-201.

22. K. Zarankiewicz, On a problem of P. Turán concerning graphs, Fund. Math., 41 (1954)
137-145; MR 16 (1955) 156.


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Manuscripts for this Department should be sent to Robert Gilmer, Department of Mathematics, 
Florida State University, Tallahassee, FL 32306. Notes are usually limited to three printed pages. 


A PROOF OF UNIQUENESS OF FACTORIZATION 
IN THE GAUSSIAN INTEGERS 


M. F. RUCHTE AND R. W. RYDEN, Humboldt State College 


Let K(i) denote the Gaussian Integers, K(i) = {a + bi | a, b are rational integers}. 
It is well known that K(i) has the unique factorization property. Normally, one 
shows that K(i) is a Euclidean domain and then uses the fact that every Euclidean 
domain is a unique factorization domain. We give a direct proof that factorization 
is unique in K(i) which parallels the proof for the rational integers as given in Niven 
and Zuckerman (p. 15). We would like to express our appreciation to Professor 
Ivan Niven for having raised for us the question of the existence of this type of proof. 


LEMMA 1. If z and w are two non-zero complex numbers such that $|w| \le |z|$
and $|\arg z - \arg w| < \pi/3$, then $|z - w| < |z|$.


Proof. The triangle formed by the points 0, z, w in the complex plane has an
angle less than $\pi/3$ at the origin, so the side opposite, which is of length $|z - w|$,
cannot be the longest side. Further, since $|w| \le |z|$, we conclude that $|z - w| < |z|$.

If z is a complex number the associates of z are the numbers z, -z, iz, -iz.


LEMMA 2. If z and w are complex numbers then there exists an associate w'
of w such that $|\arg z - \arg w'| < \pi/3$.


Proof. The associates of w are at right angles to one another; therefore, there
must be one of them in any given sector of angle $2\pi/3$.
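
The two lemmas can be checked together by brute force on a small range of Gaussian integers. The sketch below is an illustration only: the helper names `associates` and `best_associate` and the search range are assumptions made for the example, and the angle between two numbers is measured as the phase of their quotient so that it is wrapped into the interval (-pi, pi].

```python
import cmath
import math

def associates(w):
    # The four associates w, -w, iw, -iw.
    return [w, -w, 1j * w, -1j * w]

def best_associate(z, w):
    # The associate of w whose direction is closest to that of z.
    return min(associates(w), key=lambda w2: abs(cmath.phase(w2 / z)))

lemmas_hold = True
checked = 0
span = range(-6, 7)
for a in span:
    for b in span:
        for c in span:
            for d in span:
                # Require z, w non-zero and |w| <= |z| (compared via integer norms).
                if (a, b) == (0, 0) or (c, d) == (0, 0):
                    continue
                if c * c + d * d > a * a + b * b:
                    continue
                z, w = complex(a, b), complex(c, d)
                w2 = best_associate(z, w)
                angle = abs(cmath.phase(w2 / z))        # Lemma 2
                closer = abs(z - w2) < abs(z)           # Lemma 1
                if not (angle < math.pi / 3 and closer):
                    lemmas_hold = False
                checked += 1
print(lemmas_hold, checked)
```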

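If $\alpha \in K(i)$, $\alpha = a + bi$, denote by $N(\alpha)$, the norm of $\alpha$, the non-negative rational
integer $a^2 + b^2$. Note that (1) $N(\alpha\beta) = N(\alpha) \cdot N(\beta)$, (2) if $\varepsilon$ is a unit ($\varepsilon$ = 1, -1, i,
or -i) then $N(\varepsilon) = 1$, and (3) $N(\alpha) = |\alpha|^2$.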




Two factorizations $\alpha \cdot \beta$, $\gamma \cdot \delta$ are the same up to associates if $\alpha$ and $\beta$ are respectively
associates of the $\gamma$ and $\delta$ in either order and similarly for factorizations into
more than two factors.


THEOREM. Factorization in K(i) is unique up to associates. 


Proof. Suppose that K(i) does not have unique factorization and that $\alpha$ is a smallest
number in norm which admits two representations,

$\alpha = \sigma_1 \cdots \sigma_r = \rho_1 \cdots \rho_s$  (r, s > 1),

where no $\sigma_i$ is an associate of any $\rho_j$. Without loss of generality we may assume
that $|\sigma_1| \ge |\rho_1|$ and, since the factorization can be changed by taking associates
of the involved primes, that $|\arg \sigma_1 - \arg \rho_1| < \pi/3$ by Lemma 2.

Let $\beta = (\sigma_1 - \rho_1)\sigma_2\sigma_3 \cdots \sigma_r$. Then, by Lemma 1,

$N(\beta) = N(\sigma_1 - \rho_1)N(\sigma_2) \cdots N(\sigma_r) < N(\sigma_1)N(\sigma_2) \cdots N(\sigma_r) = N(\alpha)$.

But

$\beta = \sigma_1\sigma_2 \cdots \sigma_r - \rho_1\sigma_2 \cdots \sigma_r$
$\quad = \rho_1 \cdots \rho_s - \rho_1\sigma_2 \cdots \sigma_r$
$\quad = \rho_1(\rho_2 \cdots \rho_s - \sigma_2 \cdots \sigma_r)$.

So that $\beta$ admits two representations, one having $\rho_1$ as a factor and one having no
associate of $\rho_1$ as a factor; a contradiction of the minimality of $N(\alpha)$.

Notice that geometric arguments can be applied to all complex quadratic fields.
Lemma 2, however, relies on the existence of four associates at right angles. For the
case $K(\sqrt{-3})$ we also have enough units (see Niven and Zuckerman, p. 210) to
yield the conclusion of Lemma 2 so that unique factorization for $K(\sqrt{-3})$ follows
in the same fashion. However, all other $K(\sqrt{m})$ where m is negative, have only two
units, $\pm 1$, so that this particular proof cannot be extended to the case of other
negative m.


Reference 


1. I. Niven and H. Zuckerman, An Introduction to the Theory of Numbers, 2nd ed., Wiley, New 
York, 1966. 


SOME HALF-PLANE DIRICHLET PROBLEMS: A BARE HANDS APPROACH 
F. J. FLANIGAN, University of California, San Diego 


1. We suggest that students in a basic complex variables course might benefit
from seeing, early in the course, that analytic functions give ready-made solutions
to boundary-value problems, and we offer a class of attractive Dirichlet problems
for the upper half plane to illustrate this fact. These problems require only (a) the
theorem that the real and imaginary parts of a complex analytic function are harmonic,
and (b) a simple version of the partial fractions decomposition of a rational
function, as seen in calculus. This project could be arranged in exercise form, soon
after the presentation of the Cauchy-Riemann equations. The student will be delighted
to discover that he does not have to know here "how to solve partial differential
equations," for this is taken care of by (a) above.


2. The problem: Given the "boundary values" B(x), a real-valued rational
function defined everywhere on the x-axis, to find a real-valued function H(x, y)
continuous on the closed upper half plane $y \ge 0$ and harmonic on the open upper
half plane y > 0 such that H(x,0) = B(x). Note therefore that B(x) = P(x)/Q(x)
where P and Q are real polynomials with no common roots, and Q(x) has no real
roots.

Our method is always to find the ‘‘correct’’ complex analytic function and use 
its real part as the solution H(x, y). 


3. First case: B(x) is a polynomial in x. Here H(x, y) = Re B(x + iy) is a solu- 
tion. (We do not discuss uniqueness.) 

However, if $B(x) = 1/(x^2 + 1)$, then Re B(x + iy) is not a solution, because of
the pole at z = i. The solution in this case is, in fact,

$H(x,y) = 2\,\mathrm{Re}\left[\dfrac{i}{2} \cdot \dfrac{1}{z + i}\right] = \dfrac{y + 1}{x^2 + (y + 1)^2}$.


This comes from the following considerations. 


4. We consider only B(x) = P(x)/Q(x) where the roots of Q(x) are non-real 
and simple. 


LEMMA. Given B(x) as above, there is a decomposition

$B(x) = B^+(x) + B^-(x)$,

where $B^+$, $B^-$ are rational functions of x with complex coefficients such that

(i) the values $B^+(x)$, $B^-(x)$ are conjugate complex numbers,

(ii) the complexified functions $B^+(z)$, $B^-(z)$ have their poles concentrated in
the lower half plane and upper half plane, respectively.


Proof (outline). (a) As remarked above, we assume that Q(x) has roots $z_1, \dots, z_n$
(open upper half plane) and $\bar z_1, \dots, \bar z_n$, all distinct.
(b) We write down a formal partial fraction decomposition

$1/Q(x) = \sum_j [a_j/(x - z_j)] + \sum_j [b_j/(x - \bar z_j)]$

and argue that equality holds provided

$a_j = [(z - z_j)/Q(z)]_{z = z_j}$,  $b_j = [(z - \bar z_j)/Q(z)]_{z = \bar z_j}$.

One uses here the hypothesis that the roots of Q(z) are distinct.
(c) One next observes $b_j = \bar a_j$.
(d) But since B(x) = P(x)/Q(x), we have

$B(x) = P(x) \sum_j [a_j/(x - z_j)] + P(x) \sum_j [\bar a_j/(x - \bar z_j)]$.

We name the two terms separated by the + sign here $B^-(x)$, $B^+(x)$ and verify
immediately that they satisfy (i) and (ii). Done.


Just as in Section 3, we may now write down a solution to the Dirichlet problem,
namely

$H(x, y) = 2\,\mathrm{Re}\, B^+(x + iy)$.
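
The recipe H = 2 Re B^+ can be tried out numerically. The following is a minimal sketch under stated assumptions, not the author's own procedure: Q is specified by its leading coefficient and its roots in the upper half plane (assumed simple), P is passed as a callable with real coefficients, and the names `b_plus`, `dirichlet_solution` and `laplacian` are hypothetical. Harmonicity is only tested approximately with a five-point finite difference.

```python
def b_plus(z, upper_roots, p=lambda z: 1.0, lead=1.0):
    """B^+(z) = P(z) * sum_j conj(a_j) / (z - conj(z_j)), where a_j = 1/Q'(z_j)
    and Q(z) = lead * prod_k (z - z_k)(z - conj(z_k)) has simple roots."""
    total = 0j
    for j, zj in enumerate(upper_roots):
        deriv = lead * (zj - zj.conjugate())
        for k, zk in enumerate(upper_roots):
            if k != j:
                deriv *= (zj - zk) * (zj - zk.conjugate())
        a_j = 1 / deriv
        total += a_j.conjugate() / (z - zj.conjugate())
    return p(z) * total

def dirichlet_solution(x, y, upper_roots, p=lambda z: 1.0, lead=1.0):
    # H(x, y) = 2 Re B^+(x + iy)
    return 2 * b_plus(complex(x, y), upper_roots, p, lead).real

def laplacian(f, x, y, h=1e-3):
    # Five-point finite-difference approximation of H_xx + H_yy.
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h**2

# Example of Section 3: B(x) = 1/(x^2 + 1), upper root i.
H1 = lambda x, y: dirichlet_solution(x, y, [1j])
closed_form = lambda x, y: (y + 1) / (x**2 + (y + 1) ** 2)

# A second example: B(x) = 1/((x^2 + 1)(x^2 + 4)), upper roots i and 2i.
H2 = lambda x, y: dirichlet_solution(x, y, [1j, 2j])
boundary_value = lambda x: 1 / ((x**2 + 1) * (x**2 + 4))

print(H1(0.7, 0.4), closed_form(0.7, 0.4))
print(H2(0.3, 0.0), boundary_value(0.3), laplacian(H2, 0.3, 0.5))
```

In both examples the boundary values are recovered on y = 0 and the discrete Laplacian is close to zero at the interior sample point.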


The author acknowledges support from the National Science Foundation through NSF GP- 
23104. 


SINGLE LAYER POTENTIALS AND THE CAUCHY-KOWALEWSKI THEOREM 
P. A. NICKEL, North Carolina State University 


Single and double layer potentials have played a very important role in mathe- 
matical physics [1] for a long time and an important problem related to these is the 
determination of the jumps in the potentials themselves, as well as the jumps in the 
associated normal and tangential forces. In standard treatments of these phenomena 
such as [2] and [3], arguments of an advanced calculus sort are used, and these 
become extremely delicate. 

In his book on partial differential equations [4], Garabedian makes a convincing 
case for the use of the Cauchy-Kowalewski Theorem in discussing jump phenomena 
in these problems. The reason for this is quite simple; namely, when the path of 
integration is deformed and an equivalent distribution defined, the integrands 
become regular. As a result, the arguments of advanced calculus are routine, at least 
when the boundary curve $\partial D$ and distributions are analytic.


1. Purpose of this note. In the determination of the jump in the normal force for
a single layer potential in [4], an argument is made concerning the symmetry of the
contributions from two sections of a certain spherical surface in the limit as the
radius of the sections goes to 0. Even though this phrase is reasonable enough, it
would seem desirable to establish the jump relations from the inside and outside
of $D$ independently of one another, as was done in the discussion of continuity of
a single layer potential, as well as in the discussion of continuity of the normal
force for a double layer. The purpose of this note, then, is the determination of the
normal force for a single layer $\mu$ at points of $\partial D$ inside $D$ without recourse to the
argument mentioned above concerning symmetry in the limit. The same can then
be accomplished for points on $\partial D$ outside $D$, and the jump is simply the difference
of these forces.


2. Notation. The presentation here is made in two dimensions, at a point $p_0$ on
the analytic boundary of the simply connected region $D$. The normal force at
$p_0 \in \partial D$ inside of $D$ is

$$\lim\, (\partial V^-/\partial n) = (\partial V^-/\partial n)_{p_0}.$$

In particular, our task is then to develop the relation (9.46) of [4]

$$\left(\frac{\partial V^-}{\partial n}\right)_{p_0} = -\pi\mu(p_0) + \int_{\partial D} \mu(s) \left(\frac{\partial}{\partial n}\log r\right)_{p_0} ds, \tag{1}$$

where $(\partial/\partial n)$ refers to differentiation in x and y in the direction of the outward
normal and $r = \sqrt{(x - \xi_s)^2 + (y - \eta_s)^2}$, the distance from (x, y) to the point $(\xi_s, \eta_s)$
on $\partial D$, all in terms of arc length s.


For convenience, we select $p_0$ as the origin and take the x-axis in the direction
of the tangent to $\partial D$ at $p_0$. Hence there is a disc $\Delta_{R_1}(p_0)$ of radius $R_1$ and center
at $p_0$ inside of which $\partial D$ is described by y = f(x), with f(0) = f'(0) = 0. Further,
with an application of the Cauchy-Kowalewski Theorem as in [4, p. 336], there is
another disc, say $\Delta_{R_2}(p_0)$, inside of which there is a harmonic function u(z) such
that on $\Delta_{R_2}(p_0) \cap \partial D$

$$u = 0, \qquad \frac{\partial u}{\partial n} = \mu. \tag{2}$$

Hence we can employ the conditions (2) as well as the description y = f(x) of $\partial D$
inside of $\Delta_R(p_0)$ where $R = \min(R_1, R_2)$. Evidently the boundary $\partial\Delta_R(p_0)$ meets
$\partial D$ in exactly two points, denoted $z_1$ and $z_2$, with arguments taken as $\pi - \theta_1$ and
$2\pi + \theta_2$. (The figure illustrates $\theta_1$ and $\theta_2$ as positive, but this condition is not vital
to the argument.)

Applying Green’s identity to the region bounded by $\Gamma_1$ and $\Gamma_2$, along with (2),
we convert the integral for V from an integral along $\Gamma_1 + (\partial D - \Gamma_1)$ to the integral

$$V = \int_{\partial D - \Gamma_1} \mu \log r \, ds + \int_{\Gamma_2} \frac{\partial u}{\partial \nu} \log r \, ds - \int_{\Gamma_2} u \frac{\partial}{\partial \nu} \log r \, ds. \tag{3}$$

Here, $(\partial/\partial\nu)$ again represents differentiation in the direction of the outward normal,
but in the dummy variables $(\xi, \eta)$, and $\Gamma_2 = \{z : |z| = R \text{ and } \arg z_1 \le \arg z \le \arg z_2\}$.


3. The normal force. Since the path of integration no longer passes through $p_0$,
the normal differentiations are performed before integration rather than after, and
the task of establishing (1) is that of determining $J_1$ and $J_2$, the partial derivatives
of the second and third integrals of (3) in the coordinate system with origin at $p_0$.
If on $\Gamma_2$, $\xi_s$ and $\eta_s$ are written in polar coordinates $(\rho, \theta)$, we find in terms of $\theta_1$
and $\theta_2$,

$$\begin{aligned}
J_1 &= \frac{\partial}{\partial y} \int_{\pi - \theta_1}^{2\pi + \theta_2} \frac{\partial u}{\partial \rho} \log\sqrt{(x - R\cos\theta)^2 + (y - R\sin\theta)^2}\; R\,d\theta \,\Big|_{(0,0)} \\
&= -\int_{\pi - \theta_1}^{2\pi + \theta_2} \frac{\partial u}{\partial \rho}\Big|_{\rho = R} \sin\theta \, d\theta \\
&= -\int_{\pi - \theta_1}^{2\pi + \theta_2} \bigl(u_x(0,0)\cos\theta + u_y(0,0)\sin\theta\bigr)\sin\theta \, d\theta + I_R,
\end{aligned} \tag{4}$$

where $I_R$ is an integral which goes to 0 with R. But $\theta_1$ and $\theta_2$ go to 0 with R as well,
for $\tan\theta_2 = y_2/x_2 = \tfrac{1}{2} x_2 f''(\bar{x})$, with $0 < \bar{x} < x_2$. But $f''$ is bounded and it follows
that $\tan\theta_2 \to 0$ as $x_2 \to 0$. But as $R \to 0$, certainly $x_2 \to 0$. Hence $\theta_2$, itself selected
in the range $(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$, must go to 0 as well. Recalling that $u_x(0,0) = 0$ by virtue
of (2), and letting $R \to 0$, we find $J_1(0,0) = -(\pi/2)u_y(0,0) = (\pi/2)\mu(s_0)$.

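The last term of (3) is handled in the same way:

$$\begin{aligned}
J_2 &= \frac{\partial}{\partial y} \int_{\Gamma_2} u \frac{\partial}{\partial \nu} \log\sqrt{(x - \xi_s)^2 + (y - \eta_s)^2} \, ds \,\Big|_{(0,0)} \\
&= \int_{\pi - \theta_1}^{2\pi + \theta_2} u \sin\theta \, \frac{d\theta}{R}
 = \int_{\pi - \theta_1}^{2\pi + \theta_2} \bigl(u_x(0,0)\cos\theta + u_y(0,0)\sin\theta\bigr)\sin\theta \, d\theta + O(R)
\end{aligned} \tag{5}$$

and, in the limit, $J_2(0,0)$ is just $-(\pi/2)\mu(s_0)$. The equation (1) now follows from (3)
and the relation $(\partial/\partial n)_{p_0} = -(\partial/\partial y)_{(0,0)}$, and our purpose is now achieved.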

As a further observation, we see that the force $(\partial V^+/\partial n)_{p_0}$, represented by
replacing the path $\Gamma_2$ by $\Gamma_2'$, is in the limit as $R \to 0$ just the negative of the
corresponding contribution to $(\partial V^-/\partial n)_{p_0}$ from $\Gamma_2$. For the only difference in the analysis
is that the normal derivatives on $\Gamma_2'$ will be $-(\partial/\partial\rho)$, and the non-symmetric
contributions are O(R). Hence, $(\partial V^+/\partial n)_{p_0}$ and $(\partial V^-/\partial n)_{p_0}$, except for a difference in
sign, can differ only by O(R); that is, we are back to the statement of [4, p. 339]
that “these contributions can differ only in sign in the limit as $R \to 0$.”




Acknowledgement. 


The author expresses his thanks to the referee for his suggestions and, in particular, for his 
questioning one very important sign. 


References 


1. Julius A. Stratton, Electromagnetic Theory, McGraw-Hill, New York, 1941.
2. O. D. Kellogg, Foundations of Potential Theory, Ungar, New York, 1946.
3. I. G. Petrovskii, Partial Differential Equations, Saunders, Philadelphia, 1967.
4. P. R. Garabedian, Partial Differential Equations, Wiley, New York, 1964.


A GLOBAL CHARACTERIZATION OF UNIFORM CONTINUITY 
RICHARD CLEVELAND, Sacramento State College 


Let f be a function on a metric space (X,d) into a metric space (Y, D). It is well
known that f is uniformly continuous on X if and only if for any two sequences
$\{x_n\}$ and $\{y_n\}$ in X,

$$\lim_n d(x_n, y_n) = 0 \ \text{ implies } \ \lim_n D(f(x_n), f(y_n)) = 0,$$

[e.g., [1], p. 168]. From this one easily obtains the following global property of
uniform continuity.


THEOREM 1. If f is uniformly continuous on X and A and B are non-empty 
subsets of X with d(A,B) = 0, then D(f[A],f[B]) =0. 


The converse of this theorem is also true; the property of preserving zero distance 
between sets characterizes uniform continuity. This fact was first announced by 
Yu. M. Smirnov in [3]. It later appeared in a paper by H. Kenyon [2]. Both of 
these papers give the result in the setting of general uniform spaces. It is hoped that 
this note will make it more accessible to students and teachers. 


THEOREM 2. If for every pair of non-empty subsets A and B of X 
(1) d(A, B) = 0 implies D(f[A],f[B]) = 0,
then f is uniformly continuous on X. 


Proof. Suppose (1) holds, but f is not uniformly continuous on X. Then choose
$\varepsilon > 0$ so that

(2) for every $\delta > 0$ there are x and y in X such that $d(x, y) < \delta$ and
$D(f(x), f(y)) \ge 3\varepsilon$.

We keep $\varepsilon$ fixed for the rest of the proof. For any $z \in X$, let

$$S(z) = \{s \in X : D(f(s), f(z)) < \varepsilon\}$$

and

$$T(z) = \{t \in X : D(f(t), f(z)) \ge 2\varepsilon\}.$$

Notice that for any z, if $T(z) \ne \emptyset$,

$$D(f[S(z)], f[T(z)]) \ge \varepsilon,$$

so that by (1),

$$d(S(z), T(z)) > 0.$$


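We choose inductively a sequence $\{\delta_n\}$ of positive numbers and sequences $\{x_n\}$
and $\{y_n\}$ in X as follows: take $\delta_1 = 1$ and choose $x_1$ and $y_1$ by (2) so that

$$d(x_1, y_1) < \delta_1 \ \text{ and } \ D(f(x_1), f(y_1)) \ge 3\varepsilon.$$

Suppose $\delta_n$, $x_n$, and $y_n$ have been chosen. Then take $\delta_{n+1}$ so that

$$0 < \delta_{n+1} < \min\{\tfrac{1}{2}\delta_n,\ d(S(x_n), T(x_n)),\ d(S(y_n), T(y_n))\}$$

and choose $x_{n+1}$ and $y_{n+1}$ in X so that

$$d(x_{n+1}, y_{n+1}) < \delta_{n+1}$$

and

$$D(f(x_{n+1}), f(y_{n+1})) \ge 3\varepsilon.$$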


Now let $A = \{x_1, x_2, \dots\}$ and $B = \{y_1, y_2, \dots\}$. By construction, $\lim_n \delta_n = 0$, so
d(A, B) = 0. The proof will be complete when we show that

$$D(f[A], f[B]) > 0.$$

To this end, suppose m and n are integers and m < n. Then $d(x_n, y_n) < \delta_n$ and
$D(f(x_n), f(y_n)) \ge 3\varepsilon$. Suppose $D(f(x_n), f(y_m)) < \varepsilon$. This means that $x_n \in S(y_m)$. But
since $d(S(y_m), T(y_m)) > \delta_n > d(x_n, y_n)$, it follows that $y_n \notin T(y_m)$, or

$$D(f(y_n), f(y_m)) < 2\varepsilon.$$

But then we obtain the contradiction

$$D(f(x_n), f(y_n)) \le D(f(x_n), f(y_m)) + D(f(y_m), f(y_n)) < 3\varepsilon.$$

Therefore,

$$D(f(x_n), f(y_m)) \ge \varepsilon.$$

By a similar argument we obtain the same inequality for m > n, and since we already
have it for m = n, we have shown that

$$D(f[A], f[B]) \ge \varepsilon,$$

and the proof is complete.
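A concrete instance may help fix ideas. The sketch below uses a standard example that is not taken from the note: f(x) = x^2 on the real line, which is not uniformly continuous, with A the positive integers and B the points n + 1/n for n >= 2. Only finite truncations of A and B can be computed, so the truncation size N and all names here are assumptions made for illustration. The set distance shrinks like 1/N while the distance between the image sets stays above 2, so condition (1) fails.

```python
from fractions import Fraction


def set_distance(P, Q):
    """Distance between two finite sets of rationals: min |p - q|."""
    return min(abs(p - q) for p in P for q in Q)


def witness(N):
    """Return (d(A_N, B_N), D(f[A_N], f[B_N])) for f(x) = x^2.

    A_N = {1, ..., N} and B_N = {n + 1/n : 2 <= n <= N} are finite
    truncations of the infinite sets described in the lead-in.
    """
    A = [Fraction(n) for n in range(1, N + 1)]
    B = [n + Fraction(1, n) for n in range(2, N + 1)]
    d_sets = set_distance(A, B)
    d_images = set_distance([a * a for a in A], [b * b for b in B])
    return d_sets, d_images


if __name__ == "__main__":
    for size in (10, 50, 200):
        print(size, witness(size))
```

Exact rational arithmetic is used so that the two distances are not blurred by rounding.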




It is clear from the proof that the hypothesis of this theorem can be weakened 
to the assumption that (1) holds for every pair of countable subsets of X. 
To see how this theorem can be used as a test for uniform continuity, consider
the following applications:


THEOREM 3. If a sequence of uniformly continuous functions converges uni- 
formly on X, then the limit function is uniformly continuous. 


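Proof. Suppose $f_n \to g$ uniformly on X, where each $f_n$ is uniformly continuous
on X. Let A and B be non-empty subsets of X with d(A,B) = 0. Let $\varepsilon > 0$ and
choose n so large that

$$D(f_n(t), g(t)) < \varepsilon/3$$

for all $t \in X$. Then choose $x \in A$ and $y \in B$ so that

$$D(f_n(x), f_n(y)) < \varepsilon/3,$$

which is possible by Theorem 1 applied to $f_n$. Then it follows that

$$D(g(x), g(y)) \le D(g(x), f_n(x)) + D(f_n(x), f_n(y)) + D(f_n(y), g(y)) < \varepsilon,$$

and we conclude that D(g[A], g[B]) = 0, so that g is uniformly continuous by Theorem 2.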


THEOREM 4. If X is compact and f is continuous on X, then f is uniformly 
continuous on X. 
Proof. Suppose A and B are non-empty subsets of X such that d(A,B) = 0.
Because X is compact, this implies that $A^- \cap B^- \ne \emptyset$. Also, because f is
continuous, $f[S^-] \subset f[S]^-$ for any $S \subset X$. Combining these two facts, we get

$$\emptyset \ne f[A^- \cap B^-] \subset f[A^-] \cap f[B^-] \subset f[A]^- \cap f[B]^-,$$

so that D(f[A],f[B]) = 0.


References 


1. R. G. Bartle, Elements of Real Analysis, New York, 1964. 
2. H. Kenyon, Two theorems on relations, Trans. Amer. Math. Soc., 107 (3) (1963) 1-14. 
3. Yu. M. Smirnov, On proximity spaces, Mat. Sb., (N. S.) 31 (73) (1952) 543-574. 


A LETTER BY PROFESSOR POLYA 


The following letter was written recently by Professor Polya to a Department 
Chairman. It is reproduced here with Professor Polya’s permission, but with the 
omission of any other identifying names. 


“Dear Colleague:
“As you may know, I am especially concerned with problem-solving; I wrote
books about it and I stressed it in my teaching, especially in teaching high school
mathematics teachers. That the role of problem-solving in mathematics is not 
understood by non-mathematicians and is not duly appreciated by outsiders, is 
not surprising and we need not worry about it. But I heard lately that such lack 
of understanding and appreciation led to denying the promotion to a member 
of your Department. I feel that there is a serious matter of principle involved, 
and I wish to write you about it. 

“Problems play an essential role both in the progress and in the teaching of
science. I cannot develop properly this topic in this letter — it would need two 
volumes, one on history and methodology, another on pedagogy. Yet let us 
come nearer to the particular case we are concerned with. 

“Problems play an important role on all levels of mathematical instruction.
It is by solving problems that the students learn to understand, to apply and to 
appreciate the material presented in the course, and the instructor judges the 
performance of the students on the basis of their problem-solving. More advanced 
students may do some other kind of work (e.g., reports in a seminar) but problems
are the backbone of undergraduate instruction. 

“Demands on the teacher are different on different levels.

“A faculty member who teaches mainly graduate students must prepare
them for research and so he has the duty to keep contact with contemporary 
research and cannot let his own research get rusty. Does he do his duty? How 
can we judge it? Most directly on the basis of his publications. It is well known 
that the ‘principle’ of ‘publish or perish’ was unwisely applied in several cases, 
yet some rule in this direction is necessary to judge the instructors of advanced 
students. 

“A faculty member who teaches mainly undergraduate students should have,
of course, a good mathematical background and he should not let it get rusty. 
Yet to extend to his case the ‘principle’ of ‘publish or perish’ is unwise and unjust. 
Under stress — and just for prestige, without real love or interest, the faculty 
member finally produces a paper that is printed and immediately submerged, 
unread and unnoticed, in the ocean of the present overproduction —is not 
such an effort misguided? Another way of not getting rusty is to pose and solve 
problems — and it is, in my opinion, in many cases a better way: Problem-solving
is a perfectly acceptable and respectable professional activity for a mathematician
and can favorably influence his teaching. The problem section of the
American Mathematical Monthly, e.g., contains some quite difficult problems, 
has very good editorial staff, and appeals to a good number of problem-solving 
mathematicians, some of whom are quite enthusiastic. 

“If it is true what I heard that your colleague’s promotion was refused, 
because he ‘only’ solved problems and did not publish, such a decision is unwise 
and unjust. 

“Sincerely yours
(Signed by)
GEORGE POLYA
Professor Emeritus, Stanford University”


PROBLEMS AND SOLUTIONS 
EDITED BY EMORY P. STARKE


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD 
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN, 
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: GEORGE S. 
CUNNINGHAM, CLAYTON W. DODGE, HOWARD W. EVES, WILLIAM R. GEIGER, GARY HAGGARD,
PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS S. MORSE, GRATTAN P. MURPHY, EDWARD 
S. NORTHAM AND WILLIAM L. SOULE, JR. 


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems are 
urged to enclose any solutions or information that will assist the editors. Ordinarily, problems 
in well-known textbooks and results in generally accessible sources are not appropriate for this 
Department. No solutions (except those accompanying proposals) should be sent to Professor
Starke.


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department,
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of Elementary
Problems in this issue should be typed (with double spacing) and should be mailed before
April 30, 1973. Contributors (in the United States) who desire acknowledgment of receipt
of their solutions are asked to enclose self-addressed stamped postcards.


E 2391. Proposed by V. R. R. Uppuluri, Oak Ridge National Laboratory 


It is well known that three chords can divide a circular disk into at most seven 
pieces. Can these seven pieces all have the same area? 




E 2392. Proposed by David Singmaster, Polytechnic of the South Bank,
London, England


On the n x n chessboard, for $n \ge 4$, define the knight’s distance D(A, B) between
the squares A and B to be the minimum number of knight’s moves required
to go from A to B. Define the knight’s diameter M(n) of the n x n board to be the
maximum knight’s distance between any two squares on the board.

1. Is M(n) monotonic?
2. Does M(n) always equal the knight’s distance between opposite corners
of the board?
3. Prove or disprove: For $n \ge 5$, M(n) = [2n/3].
4. Determine the knight’s distance D(O, P) from the origin to an arbitrary
square P = (a,b) on the infinite chessboard.
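For readers who want to experiment before attempting a proof, the definitions of D(A, B) and M(n) can be evaluated by brute force on small boards. The sketch below uses breadth-first search over the knight's graph; the function names and the 0-based coordinates are assumptions made for this illustration, and the search answers none of the four questions by itself.

```python
from collections import deque

# The eight knight moves as (dx, dy) offsets.
KNIGHT_MOVES = [(1, 2), (2, 1), (-1, 2), (-2, 1),
                (1, -2), (2, -1), (-1, -2), (-2, -1)]


def knight_distances(n, start):
    """Breadth-first search: knight's distance from start to each reachable square."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for dx, dy in KNIGHT_MOVES:
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < n and 0 <= nxt[1] < n and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist


def knight_diameter(n):
    """M(n): the largest knight's distance between two squares of the n x n board."""
    return max(max(knight_distances(n, (i, j)).values())
               for i in range(n) for j in range(n))


def corner_distance(n):
    """Knight's distance between opposite corners of the n x n board."""
    return knight_distances(n, (0, 0))[(n - 1, n - 1)]


if __name__ == "__main__":
    for size in range(4, 13):
        print(size, knight_diameter(size), corner_distance(size))
```

Tabulating both quantities side by side for small n gives data against which conjectures about monotonicity and the corner-to-corner distance can be tested.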


E 2393. Proposed by M.S. Klamkin, Ford Motor Company 


Parallel lines are drawn through the vertices $A_0, A_1, \dots, A_n$ of a given simplex
of volume V, terminating in the opposite faces (extended if necessary) in the points
$B_0, B_1, \dots, B_n$, respectively.

(1) Show that the volume of the simplex determined by $B_0, B_1, \dots, B_n$ is nV.

(2) Show that the volume of the simplex determined by the vertices
$A_0, A_1, \dots, A_r, B_{r+1}, B_{r+2}, \dots, B_n$ is given by $V_r' = |n - r - 1|\,V$.


E 2394. Proposed by S. L. Greitzer, Rutgers University, and M. S. Klamkin, 
Ford Motor Company 


A line is drawn through the centroid G of a simplex $A_0, A_1, \dots, A_n$ intersecting
the faces (extended if necessary) in the points $B_0, B_1, \dots, B_n$, respectively. Show that

$$\sum_{i=0}^{n} \frac{1}{GB_i} = 0,$$

where $GB_i$ denotes the directed distance from G to $B_i$. Show also that the above
property characterizes the point G as the centroid; i.e., if the above sum vanishes
for all arbitrary lines, then G is the centroid.

This generalizes known results for triangles and tetrahedrons. 


E2395. Proposed by H. W. Gould, West Virginia University 


Let n be a nonnegative integer. For p = 1, 2, ... define


air EO) (e ay 


where we make the usual conventions regarding binomial coefficients. Prove that, 
whenever n is odd, A,(n) = nA,(n). 




E 2396. Proposed by Erwin Just, Bronx Community College 
(A) Prove that $2^p + 3^p$ is not a perfect power if p is prime.
(B) Find all natural numbers m and n such that $2^m + 3^n$ is a perfect square.


SOLUTIONS OF ELEMENTARY PROBLEMS 
Enumeration via the Chinese Remainder Theorem


E 2330 [1971, 1138]. Proposed by Richard Stanley, Massachusetts Institute 
of Technology 


Let f be a function from the positive integers to the integers satisfying
$f(m + n) \equiv f(n) \pmod{m}$ for all $m, n \ge 1$ (e.g., a polynomial with integer
coefficients). Let g(n) be the number of values (including repetitions) of f(1), f(2), ..., f(n)
divisible by n, and let h(n) be the number of these values relatively prime to n.
Show that g and h are multiplicative functions of n related by

$$h(n) = \sum_{d \mid n} \mu(d)\, g(d)\, (n/d) = n \prod_{p \mid n} \left(1 - \frac{g(p)}{p}\right).$$

Solution by Stephen Spindler, University of Chicago. Given (m,n) = 1 and
$1 \le a \le m$, $1 \le b \le n$, it follows from the Chinese Remainder Theorem and the
properties of f that $m \mid f(a)$ and $n \mid f(b)$ if and only if $mn \mid f(x)$ where x = x(a, b)
is that unique integer such that $x \equiv a \pmod{m}$, $x \equiv b \pmod{n}$, and $1 \le x \le mn$.
Thus g is multiplicative. For $d \mid n$, the number of values of f(1), ..., f(n) divisible
by d is just (n/d)g(d); by a straightforward inclusion-exclusion count,

$$h(n) = n - \sum (n/p)\,g(p) + \sum (n/pp')\,g(pp') - + \cdots,$$

the first sum being over all primes p such that $p \mid n$, the second being over all pairs
of distinct primes p, p' such that $pp' \mid n$, etc. Thus

$$h(n) = n \prod_{p \mid n} \left(1 - \frac{g(p)}{p}\right),$$

as desired.
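The identities can be checked by direct counting for a particular f. The sketch below assumes the sample polynomial f(x) = x^2 + 1 (any polynomial with integer coefficients would do) and compares the counted value of h(n) with both the divisor sum and the product formula. The helper names are hypothetical, and the trial-division routines are only meant for small n.

```python
from fractions import Fraction
from math import gcd


def f(x):
    """Sample polynomial (an assumption); any integer polynomial qualifies."""
    return x * x + 1


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def mobius(n):
    """Mobius function: 0 if n has a square factor, else (-1)^(number of primes)."""
    result = 1
    for p in prime_factors(n):
        if n % (p * p) == 0:
            return 0
        result = -result
    return result


def g(n):
    """Number of i in 1..n with n dividing f(i)."""
    return sum(1 for i in range(1, n + 1) if f(i) % n == 0)


def h(n):
    """Number of i in 1..n with f(i) relatively prime to n."""
    return sum(1 for i in range(1, n + 1) if gcd(f(i), n) == 1)


def h_by_divisor_sum(n):
    return sum(mobius(d) * g(d) * (n // d) for d in range(1, n + 1) if n % d == 0)


def h_by_product(n):
    value = Fraction(n)
    for p in prime_factors(n):
        value *= 1 - Fraction(g(p), p)
    return value


if __name__ == "__main__":
    for n in range(1, 31):
        print(n, g(n), h(n), h_by_divisor_sum(n), h_by_product(n))
```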


Also solved by Arnold Adelberg, S. J. Benkoski, Suzette M. Cormier, Neal Felsinger, J. L. Good- 
ling, M. G. Greening (Australia), Emil Grosswald, H. S. Hahn, J. L. Hunsucker & Jack Nebb, 
Wells Johnson, David Kelly, J. F. Marcotorchino (France), L. E. Mattics, Kenneth Rosen, Edward 
Rosenthall, Temple University Problem Solving Group, and the proposer. 


Editor’s Comment: Benkoski calls attention to Harlan Stevens, Generalizations of the Euler
φ-function, Duke Math. J., 38 (1971), 181-186, and shows how minor modifications of the results of this
paper can be used to solve the present problem. The proposer notes that the condition
$f(m + n) \equiv f(n) \pmod{m}$ can be replaced by the more general condition that m divides f(n)
if and only if m divides f(m + n).




Divers Diverse Diophantine Determinations 


E 2332 [1972, 87]. Proposed by R.S. Luthar, University of Wisconsin at
Janesville
Find all solutions in positive integers:

$$y^3 + 4y = z^2.$$

I. Solution by G. B. Robinson, State University of New York at New Paltz.
Let $y = nk^2$ with n square-free. Since $k^2 \mid z^2$, it follows that z = mk, reducing the
equation to $n(y^2 + 4) = m^2$. Since n is square-free, we have that $n \mid (y^2 + 4)$. But
$n \mid y$, so that $n \mid 4$, and hence n = 1 or n = 2. For n = 1, we have $y^2 + 4 = m^2$, which
clearly has no positive integral solution. Letting n = 2, we get $8k^4 + 8 = m^2$. It
follows that $8 \mid m^2$, so m = 4t, reducing the equation to $k^4 + 1 = 2t^2$. It is well known
that all solutions of $u^4 + v^4 = 2w^2$ are of the form $u^2 = v^2 = w$. (See, for example,
Dickson, Introduction to the Theory of Numbers, 1929, p. 43, problem 4.) Thus
k = t = 1, so the only solution is y = 2, z = 4.


II. Solution by E. P. Starke, Plainfield, N.J. If $y^3 + 4y = z^2$, then $(y^3 + 4y)^2
= y^6 + 8y^4 + 16y^2 = z^4$; also $y^6 - 8y^4 + 16y^2 = (y^3 - 4y)^2$. Subtracting these two
equations, we obtain $(2y)^4 = z^4 - (y^3 - 4y)^2$. But it is known that there exists no
solution in nonzero integers for $u^4 - v^4 = w^2$. (See Wright, Theory of Numbers,
1939, p. 104, exercise 2, for example.) Hence either

$$2y = 0 \ \text{ and } \ z^4 = (y^3 - 4y)^2 \quad\text{or}\quad z = 2y \ \text{ and } \ y^3 - 4y = 0.$$

Since y > 0, the first alternative is untenable. From the second possibility, we find
the unique solution to the problem is y = 2, z = 4.


III. Solution by E. Trost, Technikum Winterthur, Switzerland. We consider
the more general equation

$$ay^3 + 4a^3b^4y = z^2, \qquad y, z > 0, \tag{1}$$

where a, b are positive integers. If (y, z) is a solution, then the quadratic polynomial
$P(t) = ay^3t^2 - z^2t + 4a^3b^4y$ has the rational zero t = 1. Therefore $z^4 - (2aby)^4$
must be the square of an integer. Taking into account that y > 0 and applying a
result of Fermat, we infer that z = 2aby. Now we see that (1) has the unique solution
$(y, z) = (2ab^2, 4a^2b^3)$. (For further applications of this method, see E. Trost, Eine
Bemerkung zur Diophantischen Analysis, Elem. der Math. 26 (1971), 60-61.)
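A bounded exhaustive search is an easy way to see the three solutions agree on small cases. The sketch below scans y up to a chosen bound for the generalized equation of Solution III; the bound and the function name are assumptions, and a finite search can only corroborate, not replace, the uniqueness arguments above.

```python
from math import isqrt


def square_solutions(a=1, b=1, y_max=2000):
    """All (y, z) with 1 <= y <= y_max and a*y^3 + 4*a^3*b^4*y = z^2."""
    found = []
    for y in range(1, y_max + 1):
        value = a * y ** 3 + 4 * a ** 3 * b ** 4 * y
        z = isqrt(value)
        if z * z == value:
            found.append((y, z))
    return found


if __name__ == "__main__":
    print(square_solutions())         # the original problem, a = b = 1
    print(square_solutions(2, 3))     # compare with (2ab^2, 4a^2b^3)
```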


Solutions were submitted also by forty-one others. 


Return of the Rational Tangent 


E 2333 [1972, 87]. Proposed by D. E. Penney, University of Georgia
If k, m and n are integers, then one solution of the equation

$$\frac{\pi}{4} = k \arctan\frac{m}{n}$$

is k = m = n = 1. Find all others.


Comment by Andrzej Makowski, Warsaw, Poland. The given equation implies
that

$$\tan\frac{\pi}{4k} = \frac{m}{n}.$$

J. M. H. Olmsted (Rational values of trigonometric functions, this MONTHLY 52
(1945), 507-508) proved that the only rational values of $\tan \pi r$ (r a rational number)
are 0 and ±1. By virtue of this result m/n = 0 or ±1. But m = 0 is impossible, so
that k = ±1. Thus the only solutions are (k, m, n) = (1, j, j) or (−1, j, −j) where $j \ne 0$
is an arbitrary integer.


Also solved by M. T. Bird, W. J. Blundon, David Brooks, Frederick Carty, Allen Charnow & 
Hwa Tang, R. M. Giuli, Michael Goldberg, M. G. Greening (Australia). M. Hirschhorn (Australia), 
Hans Kappus (Germany), Vaclav Konetny, Carolyn MacDonald, L. E. Mattics, F. G. Schmitt, 
Jr., R. E. Shafer, G. S. Sidhu, R. Van Meter, Charles Wexler, A. Zujus, and the proposer. 


Editor’s comment: This old chestnut has been around for at least fifty years. The earliest reference 
seems to be R. S. Underwood, Supplementary note on the irrationality of certain trigonometric func- 
tions, this MONTHLY 29 (1922), p. 346. (This was noted by Schmitt.) Both Schmitt and Brooks note 
that the result can be found in Ivan Niven, Irrational Numbers, Carus Monograph No. 11, Corollary
3.12, p. 41. Zujus found the result in a paper by E. A. Yasinovyi in the Russian magazine, Mathe- 
matics in School (1958). Charnow and Tang show the analogous result for the sine (and hence the 
cosine): the only rational values of the sine are the “‘obvious”’ ones. Underwood seems to be the 
first to have noted this also —see his On the irrationality of certain trigonometric functions, this 
MONTHLY 28 (1921), 374-376. 

It is interesting that even though the values of the trigonometric functions are only very rarely
rational, the tabulated values are always algebraic! That is, if x is expressed as an integral number of
degrees, minutes and seconds (so that it is commensurable with π) then necessarily all of the
trigonometric functions of x are algebraic numbers. This was noted by Elijah Swift, Note on trigonometric
functions, this MONTHLY 29 (1922), 404-405, and again by R. W. Hamming, The transcendental
character of cos x, this MONTHLY 52 (1945), 336-337. A more difficult question involves the degree
(as an algebraic number) of the values of the trigonometric functions. See D. H. Lehmer, A note on
trigonometric algebraic numbers, this MONTHLY 40 (1933), 165-166.

For other related problems see B. H. Arnold and H. Eves, this MONTHLY 56 (1949), 20-21, 
Problem 195 [1915, 27], Problem 3733 [1937, 113], and a note by Underwood in the ‘‘Question and 
Discussion” section [1922, 255]. Still another related problem was found by Sidhu in R. D. Carmi- 
chael, Diophantine Analysis, Dover, 1915. The problem is credited to Stormer (1899) and asks for 
all integer solutions of 


$$m \arctan(1/x) + n \arctan(1/y) = k\pi/4,$$

with k, x, and y positive. It is there claimed that the only solutions are (k, m, n, x, y) = (1, 1, 1, 2, 3)
or (1, 2, -1, 2, 7) or (1, 2, 1, 3, 7) or (1, 4, -1, 5, 239).
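The four listed tuples are easy to confirm exactly, since arctan(1/x) is the argument of the Gaussian integer x + i. The sketch below multiplies out (x + i)^m (y + i)^n (1 + i)^(-k), using conjugates for negative exponents, and checks that the result is a positive real number; a floating-point comparison is added to rule out a discrepancy by a multiple of 2π. The helper names are assumptions, and the code checks given tuples only. It does not search for solutions or address the claim that the list is complete.

```python
import math


def gauss_mul(u, v):
    """Product of Gaussian integers given as (real, imag) pairs."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def gauss_pow(z, e):
    """z**e up to a positive real factor; a negative e uses the conjugate of z."""
    if e < 0:
        z, e = (z[0], -z[1]), -e
    result = (1, 0)
    for _ in range(e):
        result = gauss_mul(result, z)
    return result


def is_solution(k, m, n, x, y):
    """Check m*arctan(1/x) + n*arctan(1/y) = k*pi/4 for positive x and y."""
    product = gauss_mul(gauss_pow((x, 1), m), gauss_pow((y, 1), n))
    rotated = gauss_mul(product, gauss_pow((1, 1), -k))
    exact = rotated[1] == 0 and rotated[0] > 0   # argument is 0 modulo 2*pi
    numeric = math.isclose(m * math.atan(1 / x) + n * math.atan(1 / y),
                           k * math.pi / 4, rel_tol=1e-12)
    return exact and numeric


LISTED = [(1, 1, 1, 2, 3), (1, 2, -1, 2, 7), (1, 2, 1, 3, 7), (1, 4, -1, 5, 239)]

if __name__ == "__main__":
    for candidate in LISTED:
        print(candidate, is_solution(*candidate))
```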




Congruence Properties of $[r^n]$


E 2334 [1972, 87]. Proposed by Erwin Just, Bronx Community College 

Let k be an arbitrary positive integer. Prove that there exists a non-integral real
number r > 1 with the property that k divides $[r^n]$ for every positive integer n. (The
square brackets denote the greatest integer function.)


I. Solution by J. G. Mauldon, Amherst College. One such number is
$r = k + \tfrac{1}{2} + \sqrt{k^2 + \tfrac{1}{4}}$, which is the larger root of the quadratic equation

$$x^2 - (2k + 1)x + k = 0.$$

Note that the other root s of this equation satisfies 0 < s < 1.
Now define $u_n = r^n + s^n$ for n = 0, 1, 2, .... It can be verified that $u_n$ satisfies the
difference equation

$$u_{n+2} = (2k + 1)u_{n+1} - k u_n$$

with initial values $u_0 = 2$ and $u_1 = 2k + 1$. An obvious induction shows that $u_n$ is an
integer and $u_n \equiv 1 \pmod{k}$ for $n \ge 1$. But $r^n = u_n - s^n$, so that $u_n - 1 < r^n < u_n$ and
consequently $[r^n] = u_n - 1 \equiv 0 \pmod{k}$ as required.
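The recurrence makes this solution easy to test on a computer. The sketch below generates u_n with exact integers and, independently, computes the integer part of r^n with Python's decimal module at a working precision of 120 digits. That precision, the ranges of k and n, and the function names are assumptions chosen for this illustration.

```python
from decimal import Decimal, getcontext


def u_values(k, n_max):
    """u_0, ..., u_{n_max} from u_{n+2} = (2k + 1) u_{n+1} - k u_n."""
    u = [2, 2 * k + 1]
    for _ in range(2, n_max + 1):
        u.append((2 * k + 1) * u[-1] - k * u[-2])
    return u


def floor_of_power(k, n, digits=120):
    """Integer part of r**n for r = k + 1/2 + sqrt(k^2 + 1/4), via high precision."""
    getcontext().prec = digits
    r = Decimal(k) + Decimal(1) / 2 + (Decimal(k * k) + Decimal(1) / 4).sqrt()
    return int(r ** n)  # int() truncates, which equals the floor for positive values


if __name__ == "__main__":
    k = 7
    u = u_values(k, 10)
    for n in range(1, 11):
        print(n, floor_of_power(k, n), u[n] - 1, (u[n] - 1) % k)
```

For much larger n the quantity s^n becomes tiny, so the decimal precision would need to grow with n for the independent check to remain meaningful.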


II. Solution by Richard Scoville, Duke University. We show the following
more general result: Let $a_1, a_2, \dots$ be arbitrary integers between 0 and k − 1 inclusive.
Then there is a nonintegral r > 1 satisfying

$$[r^n] \equiv a_n \pmod{k} \tag{1}$$

for every n = 1, 2, ....

To show this, we choose by induction an increasing sequence $\{x_j\}$ and a decreasing
sequence $\{y_j\}$ as follows: Let $x_1 = k^2 + a_1$ and $y_1 = k^2 + a_1 + 1$. Assume that
$x_1, \dots, x_j$ and $y_1, \dots, y_j$ have been chosen so that the following are satisfied:

(a) $x_1 < x_2 < \cdots < x_j < y_j < \cdots < y_1$;

(b) $y_i^i - x_i^i = 1$ for i = 1, 2, ..., j;

(c) If $x_j < r < y_j$, then $[r^i] \equiv a_i \pmod{k}$ for i = 1, 2, ..., j.

(Editor’s comment: Note that (b) and (c) together are equivalent to the assumption
that for each integer i there exists an integer $m_i$ such that $x_i^i = km_i + a_i$ and
$y_i^i = km_i + a_i + 1$. In particular, note that $x_i^i$ and $y_i^i$ are always integers.)

By the Mean Value Theorem and the inductive assumption (b), we have

$$y_j^{j+1} - x_j^{j+1} = (y_j^j)^{(j+1)/j} - (x_j^j)^{(j+1)/j} > \left(1 + \frac{1}{j}\right) x_j \ge x_1 \ge k^2.$$

Since $y_j^{j+1} - x_j^{j+1} > k^2 > k + 1$, there are at least k integers lying strictly between
$x_j^{j+1}$ and $y_j^{j+1} - 1$; one of them is congruent to $a_{j+1} \pmod{k}$. Let it be $x_{j+1}^{j+1}$ and
define $y_{j+1}^{j+1}$ to be $x_{j+1}^{j+1} + 1$; then

$$x_j^{j+1} < x_{j+1}^{j+1} < y_{j+1}^{j+1} < y_j^{j+1}$$

and the inductive step is completed.

The number r which can be characterized as either the common limit of the
sequences $\{x_j\}$ and $\{y_j\}$ or the sole element of the set $\bigcap_{j=1}^{\infty} [x_j, y_j]$ can now be
seen to have property (1).


Also solved by P. K. Garlick, G. A. Heuer, L. E. Mattics, The 3—S Group of New York, Ruby 
Williams, and the proposer. 


Continuous Two-to-One Functions 


E 2335 [1972, 88]. Proposed by J. P. Celenzu, Bayside, N.Y. 
Does there exist a continuous function from the reals to the reals which is 
precisely two-to-one? 


Comment by M. L. Klasi and C. A. Grimm, South Dakota School of Mines 
and Technology. In his published solution to E 1094 [ 1954, 425], Azriel Rosenfeld 
shows that a continuous function (real-valued with domain an interval) which takes 
on no value more than twice must take on some value exactly once. A continuous 
two-to-one function therefore cannot exist. 


Also solved by the proposer and 64 others. 


Editor’s comment: All of the solutions used in one way or another the intermediate-value
property of continuous functions and the connectedness of the domain. Note however, that the
transformation $f(z) = z^2$ in the complex plane maps the unit circle (which is connected) onto itself
in precisely two-to-one fashion.

Reference was made to a number of related articles including O. G. Harrold, Exactly (k, 1) 
transformations on connected linear graphs, Amer. J. Math. 62 (1940), 823-834; O. G. Harrold, The 
non-existence of a certain type of continuous transformation, Duke Math. J. 5 (1939), 789-793; 
J. Mioduszewski, On two-to-one continuous functions, Rozp. Mat. 24 (1961) p. 36, and J. H. 
Roberts, Two-to-one transformations, Duke Math. J. 6 (1940), 256-262. Harrold shows that if f
is a (continuous) map from an arc to an arc which is at most two-to-one and which preserves
endpoints, then f is necessarily a homeomorphism (and thus one-to-one). Roberts shows that there does
not exist a continuous two-to-one transformation on a closed two-cell.

Generalizations are always welcomed by the editors. Several readers show that there exists a
continuous n-to-one function from the reals to the reals if and only if n is odd. These readers are:
David Kelly, Robert Patenaude, Howard Penn, Mary Powderly, Kenneth Rosen, E. F. Schmeichel,
Gary Sherman, Robert Spira, and Clifford Wagner. Basilios Krikeles, an undergraduate, noted this
theorem, but did not prove it. The non-existence of continuous even-to-one functions is proved in
an analogous manner to the special case n = 2, whereas the existence of continuous odd-to-one
functions is shown by construction. One of the simplest is due to Schmeichel: Let n be odd and
define the graph of f on the interval [k, k + 1], where k is any integer, to consist of the polygonal arc
which joins in order the points (k, k), (k + 1/n, k + 1), (k + 2/n, k), (k + 3/n, k + 1), ...,
(k + 1, k + 1).
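The zigzag construction attributed to Schmeichel can be checked mechanically by counting preimages. The sketch below solves for the exact points where the polygonal graph meets a horizontal line, using rational arithmetic; the function name and the sample heights are assumptions for illustration. For an integer height both adjacent unit intervals are inspected, and the shared endpoint is counted once.

```python
from fractions import Fraction


def preimages(y, n):
    """Exact set of x with f(x) = y for the zigzag function with odd n.

    On [k, k + 1] the graph joins the vertices (k + i/n, k + (i mod 2))
    for i = 0, 1, ..., n, as in the construction quoted above.
    """
    if n % 2 == 0:
        raise ValueError("the construction requires odd n")
    y = Fraction(y)
    base = y.numerator // y.denominator      # floor of y
    blocks = {base}
    if y == base:
        blocks.add(base - 1)                 # an integer height also meets the block below
    points = set()
    for k in blocks:
        for i in range(n):
            x0, x1 = k + Fraction(i, n), k + Fraction(i + 1, n)
            y0, y1 = k + (i % 2), k + ((i + 1) % 2)
            t = (y - y0) / (y1 - y0)         # position along this segment
            if 0 <= t <= 1:
                points.add(x0 + t * (x1 - x0))
    return points


if __name__ == "__main__":
    for height in (Fraction(0), Fraction(1, 2), Fraction(-7, 3), Fraction(5)):
        print(height, sorted(preimages(height, 3)))
```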

We note the similarity of this problem to E 1715 [1965, 784] which exhibits a continuous function
on [0, 1] which is precisely $\aleph_0$-to-one; the result of this problem is also found in B. R. Wenner,
Continuous, exactly k-to-one functions on R, Math. Mag., 45 (1972) 224-225.


Mobius Transformations of Finite Order 


E 2336 [1972, 88]. Proposed by William Fortney, Dumaguete City, Philippines, 
and Robert Breusch, Amherst College 

Consider the group of bijective rational functions over the complex numbers
(with $\infty$) under the operation of composition. For any positive integer n,
characterize the elements of order n.


Solution by J. G. Mauldon, Amherst College. Denote the given group by G;
it can be shown that $f \in G$ if and only if f is a fractional linear or Mobius
transformation:

$$f(z) = \frac{az + b}{cz + d}, \qquad ad - bc \ne 0.$$

(A Mobius transformation is normalized if ad − bc = 1; every Mobius transformation
can be normalized by dividing through all of the parameters by a square
root of ad − bc.)

Suppose that $f \in G$ is of finite order $n > 1$. It is well known that every Möbius
transformation (other than the identity) has either one or two fixed points (finite
or infinite). See H. Behnke and F. Sommer, Theorie der analytischen Funktionen
einer komplexen Veränderlichen, Springer-Verlag, Berlin, 1962, p. 327. If f has
one fixed point, then f is conjugate (in the group-theoretic sense) to a transformation
with a single fixed point at $\infty$, i.e., to a translation. But a (nontrivial) translation
cannot have finite order, and conjugate elements have the same order, so this case
is impossible.

Thus f must have two distinct fixed points. If these fixed points are 0 and $\infty$,
then $f(z) = \alpha z$ for some fixed complex number $\alpha$; that is, f is a dilation/rotation.
Since f has order n, it follows that $\alpha = \omega$, where $\omega$ is a primitive nth root of unity.
In general, f is conjugate to a dilation/rotation; that is, f is conjugate to g where
$g(z) = \omega z$ and $\omega$ is a primitive nth root of unity (ibid. pp. 327 ff.); thus f is elliptic.
This characterizes the elements of order n: they are conjugate in G to rotations
by $2\pi k/n$, where $(k, n) = 1$.

In terms of the original parameters a, b, c, d, it is known that if

\[ f(z) = \frac{az + b}{cz + d}, \qquad g(z) = \frac{a'z + b'}{c'z + d'}, \]

where $ad - bc = a'd' - b'c' = 1$, then f and g are conjugate if and only if
$a + d = \pm(a' + d')$ (ibid. pp. 329-330). If $g(z) = \omega z$, where $\omega = e^{2\pi i k/n}$, then we can
take $a' = e^{\pi i k/n}$, $d' = e^{-\pi i k/n}$, and $b' = c' = 0$. Thus if

\[ f(z) = \frac{az + b}{cz + d}, \qquad ad - bc = 1, \]

then f is of order $n > 1$ if and only if

\[ a + d = \pm(a' + d') = \pm 2\cos\left(\frac{\pi k}{n}\right), \]

where $(k, n) = 1$. In general (i.e., if f is not normalized), let $\sqrt{ad - bc}$ denote
either of the (complex) square roots of $ad - bc$. Then f is of order n if and only if

\[ a + d = \pm 2\sqrt{ad - bc}\,\cos\left(\frac{\pi k}{n}\right), \qquad (k, n) = 1. \]

Also solved by the proposers. 


Editor’s comment: Let M denote the multiplicative group of complex 2 × 2 matrices with deter-
minant 1 (the unimodular group). Then M and G are isomorphic under

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \leftrightarrow f(z) = \frac{az + b}{cz + d}, \]

so that the problem is essentially that of determining primitive nth roots of the identity matrix in M.
This was the approach taken by the proposers in their solution. Note that $a + d$ is the trace of the
matrix

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

An interesting discussion of the geometry of Möbius transformations can be found in J. Hark-
ness and F. Morley, Introduction to the Theory of Analytic Functions, Macmillan, 1898, pp. 27-45
and 57-66. The authors also take up the special cases n = 2 and n = 3 of our problem.
A related problem is E 2186 [1970, 531]. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers — The State University, 
New Brunswick, N.J., 08903. Solutions of Advanced Problems in this issue should be typed (with
double spacing) on separate, signed sheets and should be mailed before April 30, 1973. Con-
tributors (in the United States) who desire acknowledgement of receipt of their solutions are 
asked to enclose self-addressed, stamped postcards. 


An asterisk (*) means neither the proposer nor the editors supplied a solution. 


5889*. Proposed by J. K. Doyle and F. A. Kuzam, Syracuse University 
Does there exist a topological group G with more than one point such that the 
fundamental group of G is isomorphic with G? 


5890*. Proposed by Harry Ruderman, Hunter College High School 
Prove or disprove that the minimum of $|a^4 + b^4 - c^4|$ is equal to 64, where
$1 \le a < b < c$.


5891. Proposed by V. Dlab and C. M. Ringel, Carleton University, Ottawa, 
Canada 

Prove or disprove that the left and right lengths of every factor ring R/I coincide,
where R is a local quasi-Frobenius ring and I is an ideal of R. A local quasi-Frobenius
ring is an artinian local ring with a unique minimal left and a unique minimal right
ideal (which necessarily coincide).




5892*. Proposed by John Myhill, University of Leeds, England 

Let A be a subset of the plane. For $p \in A$, $\varepsilon > 0$, let $N_\varepsilon(p) = \{q \in A : d(p, q) < \varepsilon\}$.
A is called uniform if for any $p, q \in A$ and any $\varepsilon > 0$, there is an isometry of $N_\varepsilon(p)$
onto $N_\varepsilon(q)$ taking p into q. Are there any uniform, closed connected sets other than
the straight line, the circle, and the entire plane (and the empty set and a single
point)?


5893. Proposed by F. K. Dashiell, Jr., University of California at Berkeley 

A Borel subset of the unit interval $I = [0, 1]$ is called metrically dense if its
intersection with each open interval in I has positive Lebesgue measure. Call a Borel
subset D of I metrically balanced if both D and its complement $I - D$ are metrically
dense. Prove that any Borel set S in I is the symmetric difference of two metrically
balanced Borel sets $D_1$ and $D_2$, that is, $S = (D_1 - D_2) \cup (D_2 - D_1)$.


5894. Proposed by J. F. Kemp, Jr., Amoco Research, Tulsa, Okla. 

If $F_1(x_1), F_2(x_2), \dots, F_n(x_n)$ are n probability distribution functions, then prove
that $\min(F_1(x_1), F_2(x_2), \dots, F_n(x_n))$ is an n-dimensional probability distribution
function with marginals $F_1(x_1), F_2(x_2), \dots, F_n(x_n)$.


SOLUTIONS OF ADVANCED PROBLEMS 


Set of Second Category 


5814 [1971, 911]. Proposed by A. C. Segal, University of Alabama, Birming- 
ham 


It is known that a subset of the real line which is either first category or measure
zero must have a void intersection with uncountably many Lebesgue cosets (i.e.,
cosets modulo the subgroup of rationals). Prove or disprove the converse.


Solution by the proposer. This converse is false. Let S denote the union of all 
rational translates of the Cantor set. S is first category, has measure zero, and com- 
pletely fills the uncountably many cosets in which elements of the Cantor set appear. 
Therefore, T, the complement of S, has nonzero measure, is second category, 
and misses the uncountably many cosets which S fills. 


Also solved by R. P. Boyer, and by J. C. Morgan II. 
Groups with Center and Commutator Subgroup of Order p


5815 [1971, 912]. Proposed by L. W. Shapiro, Howard University 
Show that there are no groups G of order $p^{2n}$ with center and commutator
subgroup of order p, where p is a prime.


Solution by Frank DeMeyer, Colorado State University. Let G be a p-group
whose center and commutator subgroup both have order p. Let Z be the center of G.
Then G/Z is a p-group, so the center of G/Z is nontrivial. Let $x \in G - Z$ represent an
element in the center of G/Z. For any $y \in G$, $[x, y] = x y x^{-1} y^{-1} \in Z$ and $x \notin Z$, so
for some y, $[x, y]$ is a non-identity element of Z. Since both Z and the commutator
subgroup of G have order p they must coincide. Now Z is cyclic and G/Z is abelian,
so Problem 5689 [1970, 1016] asserts $G/Z \cong H \times H$ for some abelian group H.
Thus the order of G/Z is an even power of p, so the order of G is an odd power of p.
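
A small brute-force check illustrates the statement on one family of examples. This sketch is not part of the published solution: it assumes the Heisenberg group of upper unitriangular 3 × 3 matrices over the integers mod p, encoded as triples (a, b, c), and all function names are hypothetical. It confirms that in this family the center and the commutator subgroup coincide, have order p, and that the group order p^3 is an odd power of p.

```python
from itertools import product


def heisenberg_group(p):
    """All triples (a, b, c) mod p, standing for unitriangular 3 x 3 matrices."""
    return list(product(range(p), repeat=3))


def mul(g, h, p):
    """Group law: (a, b, c)(d, e, f) = (a + d, b + e, c + f + a*e) mod p."""
    return ((g[0] + h[0]) % p, (g[1] + h[1]) % p, (g[2] + h[2] + g[0] * h[1]) % p)


def inv(g, p):
    """Inverse of (a, b, c) is (-a, -b, a*b - c) mod p."""
    a, b, c = g
    return ((-a) % p, (-b) % p, (a * b - c) % p)


def center(G, p):
    """Elements commuting with every element of G."""
    return {z for z in G if all(mul(z, g, p) == mul(g, z, p) for g in G)}


def commutator_subgroup(G, p):
    """Subgroup generated by all commutators x y x^-1 y^-1."""
    sub = {mul(mul(x, y, p), mul(inv(x, p), inv(y, p), p), p) for x in G for y in G}
    while True:
        # In a finite group, closing under products already yields a subgroup.
        bigger = sub | {mul(a, b, p) for a in sub for b in sub}
        if bigger == sub:
            return sub
        sub = bigger
```

Running this for p = 2, 3, 5 gives groups of order 8, 27, 125, consistent with the conclusion that such groups have order an odd power of p.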


Also solved by D. Z. Djoković, Vance Faber, M. G. Greening (Australia), W. H. Gustafson, W.
M. Hill, A. A. Jagers (Netherlands), W. G. Leavitt, C. Y. Tang & H. T. Tang, Gomer Thomas, 
J. F. Watters (England), Mark Yu, and the proposer. 


Combinatorics of Matrices with 0’s and 1’s 


5816 [1971, 912]. Proposed by Solomon Leader, Rutgers University 


Let P be a nonempty, finite set with p members, and Q be a finite set with q
members. Let $N_k(p, q)$ be the number of binary relations of cardinality k with domain
P and range Q. (Equivalently, $N_k(p, q)$ is the number of $p \times q$ matrices of 0’s and
1’s with exactly k entries equal to 1 and no row or column identically 0.) Compute
$\sum_{k=1}^{pq} (-1)^{k-1} N_k(p, q)$.


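Solution by Harry Lass, California Institute of Technology. Let $A_i$ be the
event that row i has all zeros, $i = 1, 2, \dots, p$, and let $B_j$ be the event that column j
has all zeros, $j = 1, 2, \dots, q$. It follows that

\[ N_k(p, q) = \binom{pq}{k} - N\Bigl(\bigcup_{i=1}^{p} A_i \cup \bigcup_{j=1}^{q} B_j\Bigr) \]

is the number of $p \times q$ matrices with exactly k entries equal to one, all other entries
being zero, with no row or column identically zero. By the law of inclusion-exclusion
we have

\[ N_k(p, q) = \binom{pq}{k} - \left[\binom{p}{1}\binom{(p-1)q}{k} + \binom{q}{1}\binom{(q-1)p}{k}\right] + \cdots
 = \sum_{i=0}^{p} \sum_{j=0}^{q} (-1)^{i+j} \binom{p}{i} \binom{q}{j} \binom{(p-i)(q-j)}{k}. \]

From $\sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} = 1$ for $n \ge 1$ it follows that

\[ \sum_{k=1}^{pq} (-1)^{k-1} N_k(p, q)
 = \sum_{i=0}^{p-1} (-1)^{i} \binom{p}{i} \sum_{j=0}^{q-1} (-1)^{j} \binom{q}{j}
 = (-1)^{p-1} (-1)^{q-1} = (-1)^{p+q}. \]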
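
The inclusion-exclusion count and the final alternating sum can be verified by exhaustive enumeration for small p and q. The following is a sketch with hypothetical function names; the double-sum form of inclusion-exclusion used in `count_by_formula` is the standard one for forbidding empty rows and empty columns.

```python
from itertools import product
from math import comb


def count_by_enumeration(p, q):
    """counts[k] = number of p x q 0-1 matrices with k ones, no zero row or column."""
    counts = [0] * (p * q + 1)
    for bits in product((0, 1), repeat=p * q):
        rows = [bits[r * q:(r + 1) * q] for r in range(p)]
        no_zero_row = all(any(row) for row in rows)
        no_zero_col = all(any(row[c] for row in rows) for c in range(q))
        if no_zero_row and no_zero_col:
            counts[sum(bits)] += 1
    return counts


def count_by_formula(p, q, k):
    """Inclusion-exclusion over the i forbidden rows and j forbidden columns."""
    return sum(
        (-1) ** (i + j) * comb(p, i) * comb(q, j) * comb((p - i) * (q - j), k)
        for i in range(p + 1)
        for j in range(q + 1)
    )


def alternating_sum(p, q):
    """Sum over k = 1..pq of (-1)^(k-1) N_k(p, q)."""
    return sum((-1) ** (k - 1) * count_by_formula(p, q, k) for k in range(1, p * q + 1))
```

For p = q = 2 the counts are N_2 = 2, N_3 = 4, N_4 = 1, so the alternating sum is -2 + 4 - 1 = 1 = (-1)^4.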




Also solved by Robert Breusch, M. G. Greening (Australia), D. J. Kleitman, P. R. Stein, B. R. 
Toskey, and the proposer. 


Laplace Transform of a Differentiable Function 


5817 [1971, 912]. Proposed by M. F. Neuts, Purdue University 

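Let $f(t)$ be the characteristic function of a probability distribution $F(\cdot)$ whose
$(n + 1)$st moment is finite. For all $\lambda > 0$ the integral $I(\lambda) = \int_0^\infty e^{-\lambda t} f(t)\,dt$ exists. Prove
that

\[ \lim_{\lambda \to +\infty} \frac{1}{I(\lambda)} \sum_{\nu=0}^{n} i^{\nu} \mu_\nu \lambda^{-\nu-1} = 1, \]

where $\mu_\nu$ is the $\nu$-th moment of $F(\cdot)$.
As a particular case, obtain the classical asymptotic expansion

\[ 1 - \Phi(\lambda) \sim \frac{1}{\sqrt{2\pi}}\, e^{-\lambda^2/2}\, \lambda^{-1} \Bigl[ 1 + \sum_{\nu=1}^{\infty} (-1)^{\nu}\, 1 \cdot 3 \cdots (2\nu - 1)\, \lambda^{-2\nu} \Bigr] \]

for the normal distribution function as $\lambda \to +\infty$.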


Solution by A. A. Jagers, Twente University of Technology, Netherlands 

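From the given context it is clear that a stronger statement is possible,

\[ \lim_{\lambda \to +\infty} \lambda^{n+1} \Bigl[ I(\lambda) - \sum_{\nu=0}^{n} i^{\nu} \lambda^{-\nu-1} \mu_\nu \Bigr] = 0. \]

Since F has a finite $(n + 1)$st moment, its characteristic function f can be expanded
$n + 1$ steps in a Maclaurin series:

\[ f(t) = \int_{-\infty}^{\infty} e^{itx}\, dF(x) = \sum_{\nu=0}^{n} \frac{i^{\nu} \mu_\nu}{\nu!}\, t^{\nu} + \frac{f^{(n+1)}(\theta t)}{(n+1)!}\, t^{n+1}, \]

$0 < \theta < 1$, where $f^{(n+1)}$ is continuous and bounded by

\[ \int_{-\infty}^{\infty} |x|^{n+1}\, dF(x) = M < \infty. \]

Hence, in terms of Laplace transforms,

\[ I(\lambda) = \sum_{\nu=0}^{n} i^{\nu} \mu_\nu \lambda^{-\nu-1} + R_n(\lambda), \]

where $|R_n(\lambda)| \le M \lambda^{-n-2}$; the statement now follows. Finally the given asymptotic
expansion for the normal distribution $\Phi$ is obtained as the particular case $F = \Phi$,
noting that in this case $f(t) = \exp(-t^2/2)$ and so $I(\lambda) = \sqrt{2\pi}\, \exp(\lambda^2/2) \cdot (1 - \Phi(\lambda))$,
$\mu_\nu = 0$ for $\nu$ odd, and $\mu_\nu = 1 \cdot 3 \cdot 5 \cdots (2k - 1)$ for $\nu = 2k$.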
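
The normal-distribution case lends itself to a quick numerical comparison. The sketch below is an illustration only, with hypothetical function names; it evaluates $1 - \Phi(\lambda)$ through the standard-library complementary error function and compares it with a truncation of the series $\frac{1}{\sqrt{2\pi}} e^{-\lambda^2/2} \sum_{\nu} (-1)^{\nu}\, 1 \cdot 3 \cdots (2\nu - 1)\, \lambda^{-2\nu-1}$ obtained from the even moments above.

```python
from math import erfc, exp, pi, sqrt


def normal_tail(lam):
    """1 - Phi(lam) for the standard normal distribution."""
    return 0.5 * erfc(lam / sqrt(2))


def odd_product(v):
    """1 * 3 * 5 * ... * (2v - 1), with the empty product equal to 1."""
    result = 1
    for m in range(1, v + 1):
        result *= 2 * m - 1
    return result


def tail_expansion(lam, terms):
    """Partial sum of the asymptotic series, keeping nu = 0, 1, ..., terms."""
    series = sum((-1) ** v * odd_product(v) / lam ** (2 * v) for v in range(terms + 1))
    return exp(-lam * lam / 2) / (sqrt(2 * pi) * lam) * series


def relative_error(lam, terms):
    """Relative error of the truncated series against the erfc-based value."""
    exact = normal_tail(lam)
    return abs(tail_expansion(lam, terms) - exact) / exact
```

Because the series is asymptotic rather than convergent, the useful comparison is at fixed truncation with growing λ: the relative error at λ = 10 is far smaller than at λ = 5.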


Also solved by S. A. Book, P. W. A. Dayananda (Singapore), A. K. Gupta & L. Jensen, J. C.
Hickman, Harry Lass, O. P. Lossers (Netherlands), O. G. Ruehr, P. H. Young, and the proposer.




Simultaneous Congruences 


5818 [1971, 912]. Proposed by Erwin Just, Bronx Community College 
Let $q \ge 5$ be an integer of one of the forms $6n \pm 1$. Must there exist a prime p and
an integer x for which
$x^{q-1} - x + 1 \equiv 0 \pmod{p}$ and $x^{q} \equiv 1 \pmod{p}$?


Solution by David Spear, student, City College, New York. We prove a stronger
result: Given any positive integer q other than 1, 2, 3 or 6, there exist a prime p and
an integer x such that $x^{q-1} - x + 1 \equiv 0 \pmod{p}$ and $x^{q} \equiv 1 \pmod{p}$.

Case I. q is odd. Let $f_n$ denote the nth Fibonacci number ($f_{n+1} = f_n + f_{n-1}$,
$f_0 = 0$, $f_1 = 1$). It suffices to let p be any odd prime divisor of $(f_{q-1} + f_{q+1})$, if there
is any, and to let $x = \tfrac{1}{2}(p + 1)(5 f_{q+1} + 1)$. For let $a = f_{q+1}$. Then

\[ f_{q-1} + f_{q+1} \equiv 0 \pmod{p} \implies f_{q-1} \equiv -a \pmod{p}, \]
\[ f_q = f_{q+1} - f_{q-1} \implies f_q \equiv 2a \pmod{p}, \]
\[ f_q^2 = f_{q-1} f_{q+1} + 1 \ (\text{with } q \text{ odd}) \implies 5a^2 \equiv 1 \pmod{p}, \]
\[ 2x = (p + 1)(5a + 1) \implies 2x \equiv 5a + 1 \pmod{p}. \]

Then $4x^2 \equiv 25a^2 + 10a + 1 \equiv 5 + 10a + 1 = 2(5a + 1) + 4 \equiv 4x + 4 \pmod{p}$,
whence $x^2 \equiv x + 1 \pmod{p}$. A simple induction yields $x^k \equiv f_k x + f_{k-1}$, $k = 1, 2, 3, \dots$.
Then

\[ x^q \equiv f_q x + f_{q-1} \equiv 2ax - a \equiv a(5a + 1) - a = 5a^2 \equiv 1 \pmod{p}. \]

Now $x(x^{q-1} - x + 1) = x^q - x^2 + x \equiv 1 - x^2 + x \equiv 0 \pmod{p}$, but $x \not\equiv 0$ (because
$x^q \equiv 1 \pmod{p}$) implies $x^{q-1} - x + 1 \equiv 0 \pmod{p}$.

These are the desired relations, but it remains to show that $(f_{q-1} + f_{q+1})$ must
have an odd prime divisor. Suppose, instead, that $f_{q-1} + f_{q+1} = 2^s$ for some integer
s. Given $q \ne 1$, $q \ne 3$. Thus $q \ge 5$, which implies $f_{q-1} + f_{q+1} \ge 8$, so that $s \ge 3$.
Now $2 \mid f_n$ if and only if $3 \mid n$, and $16 \mid f_n$ if and only if $12 \mid n$. Also $f_{2q} = f_q(f_{q-1} + f_{q+1})$,
so $f_{2q} = 2^s f_q$. Then

\[ 2^s \mid f_{2q} \implies 2 \mid f_{2q} \implies 3 \mid 2q \implies 3 \mid q \implies 2 \mid f_q \implies 2^{s+1} \mid f_{2q} \implies 16 \mid f_{2q} \implies 12 \mid 2q \implies 6 \mid q \implies 2 \mid q, \]

which is a contradiction. This completes the proof for case I.
Case II. q is divisible by 4. Let $q = 4m$, $p = 5$, $x = 3$. Then
$3^4 \equiv 1 \pmod{5} \implies 3^{4m} \equiv 1 \pmod{5}$, i.e., $x^q \equiv 1 \pmod{p}$,
from which $3(3^{4m-1} - 3 + 1) = 3^{4m} - 3^2 + 3 \equiv 1 - 9 + 3 \equiv 0 \pmod{5}$. Hence
$3^{4m-1} - 3 + 1 \equiv 0 \pmod{5}$, i.e., $x^{q-1} - x + 1 \equiv 0 \pmod{p}$, fulfilling the require-
ments.




Case III. q is even but not divisible by 4. Then $q = 2t$ for some odd positive
integer t. Given $q \ne 2$, $q \ne 6$, we have $t \ne 1$, $t \ne 3$, therefore case I applies. That is,
there exist a prime p and an integer x such that $x^{t-1} - x + 1 \equiv 0 \pmod{p}$ and
$x^{t} \equiv 1 \pmod{p}$. Then

\[ x^{q-1} - x + 1 = x^{2t-1} - x + 1 = x^{t}(x^{t-1}) - x + 1 \equiv x^{t-1} - x + 1 \equiv 0 \pmod{p} \]

and further $x^{q} = x^{2t} = (x^{t})^2 \equiv 1 \pmod{p}$. This completes the proof.
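
The three cases translate directly into a procedure that produces a pair (p, x) for any admissible q. What follows is a sketch, not the solver's own code: the function names are hypothetical, and the odd prime divisor is found by plain trial division, which is adequate only for small q.

```python
def fib(n):
    """n-th Fibonacci number with f_0 = 0, f_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def smallest_odd_prime_factor(m):
    """Smallest odd prime dividing m, or None if m is a power of 2."""
    while m % 2 == 0:
        m //= 2
    if m == 1:
        return None
    d = 3
    while d * d <= m:
        if m % d == 0:
            return d
        d += 2
    return m


def spear_witness(q):
    """Return (p, x) with x^(q-1) - x + 1 = 0 and x^q = 1 modulo p."""
    if q < 1 or q in (1, 2, 3, 6):
        raise ValueError("q must be a positive integer other than 1, 2, 3, 6")
    if q % 4 == 0:
        return 5, 3                      # Case II
    t = q if q % 2 == 1 else q // 2      # Case I, or Case III reduced to Case I
    p = smallest_odd_prime_factor(fib(t - 1) + fib(t + 1))
    if p is None:
        raise ArithmeticError("excluded by the proof for odd t >= 5")
    x = ((p + 1) // 2 * (5 * fib(t + 1) + 1)) % p
    return p, x
```

For q = 5 this gives p = 11 and x = 4: indeed 4^5 = 1024 ≡ 1 and 4^4 - 4 + 1 = 253 = 11 · 23.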


Also solved by Robert Breusch, Irving Gerst, A. A. Jagers (Netherlands), L. E. Mattics, P. L. 
Montgomery, and the proposer. 


Convergence of Function Iterates 


5820 [1971, 1027]. Proposed by Julio Cano, Findlay College, Ohio 

Let K be a compact subset of the line and f a continuous function from K to K.
Suppose that $x_0 \in K$ has the property that every cluster point of the sequence $\{f^n(x_0)\}$
is a fixed point of f. Show that the sequence is convergent. Show also that this result
fails in two dimensional Euclidean space.


I. Solution by S. J. Bernau, University of Texas at Austin. Write $x_n = f^n(x_0)$.
We ignore the trivial case when some $x_n$ is fixed by f, i.e., we assume that no $x_n$
is a cluster point of $\{x_n\}$. Let $b = \limsup x_n$, $a = \liminf x_n$, and suppose $a < b$. If
there exists k such that $x_k \in [a, b]$ then, since $x_k$ is not a cluster point of $\{x_n\}$,
there is a non-empty open interval $(c, d) \subset (a, b)$ such that $x_k \in (c, d)$ and $x_n \notin (c, d)$
if $n \ne k$. We conclude that in any case there is a non-empty interval $(u, v)$ such
that $(-\infty, u)$ and $(v, \infty)$ both contain infinitely many $x_n$. Hence we choose an
infinite subsequence $\{x_{n_k}\}$ of $\{x_n\}$ such that $x_{n_k} < u$ and $x_{n_k + 1} > v$ for all k. Clearly,
no cluster point y, say, of $\{x_{n_k}\}$ can be fixed by f (continuous) since $y \le u$ and
$f(y) \ge v$. Thus $a = b$ and $\{x_n\}$ converges.


II. Solution by R. O. Davies, The University, Leicester, England. To show
that the proposition fails in the plane, use polar coordinates and define f on the
annulus $\tfrac{1}{2} \le r \le 1$ by $f((r, \theta)) = ((2 - r)^{-1}, \theta + 1 - r)$. Then f is continuous, and
every point of the circumference $r = 1$ is fixed. When $x_0 = (\tfrac{1}{2}, 1)$ we find that

\[ f^n(x_0) = \Bigl( \frac{n+1}{n+2},\ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n+1} \Bigr); \]

hence the cluster points of $\{f^n(x_0)\}$ are the points of $r = 1$ and are fixed, but the
sequence is not convergent.
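
The closed form for the iterates can be confirmed with exact rational arithmetic. This is a minimal sketch with hypothetical names (`step`, `iterate`); points are represented by their polar coordinates (r, θ), with θ left unreduced modulo 2π so that the harmonic sums are visible.

```python
from fractions import Fraction


def step(point):
    """One application of f in polar coordinates: (r, theta) -> (1/(2 - r), theta + 1 - r)."""
    r, theta = point
    return 1 / (2 - r), theta + 1 - r


def iterate(n, start=(Fraction(1, 2), Fraction(1))):
    """Return f^n(start), computed exactly."""
    point = start
    for _ in range(n):
        point = step(point)
    return point


def harmonic(m):
    """1 + 1/2 + ... + 1/m as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, m + 1))
```

Since the radius tends to 1 while the angle grows like the divergent harmonic series, the orbit winds around the unit circle indefinitely, so every point of the circle is a cluster point.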


Also solved by Skagi Aggi, Max Broberg (Sweden), Bruce Ferrero, R. B. Israel, A.A. Jagers 
(Netherlands), J. G. Mauldon, S. S. Mitra, Nicholas Passell, and the proposer. 


THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 BY BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


FEBRUARY


VOLUME 80 NUMBER 2
CODEN: AMMYAE
CONTENTS
Award for Distinguished Service to Professor Raymond L. Wilder . . . . . . . . . 117
Award of the 1973 Chauvenet Prize to Professor Carl Douglas Olds . . . . . . . . 120
The Elementary Cases of Landau’s Problem of Inequalities Between Derivatives .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . I. J. SCHOENBERG 121
Types of Fully Ordered Groups . . . . . . . . . . . . . . . D. P. MINASSIAN 159
The William Lowell Putnam Mathematical Competition . . . . . . . J. H. MCKAY 170
MATHEMATICAL NOTES
The Area of a Hypersphere in Riemannian Space . . . . . . . . . B. A. FUSARO 179
An Area Theorem for Schlicht Functions . . . . . . . . . . . . J. L. ULLMAN 184
On Set Points of Discontinuity . . . . . . . . . . . . . . . . . O. T. ALAS 186
Generalized Fibonacci Number Triples . . A. G. SHANNON AND A. F. HORADAM 187
Ambivalence in Alternating Symmetric Groups . . . . . . . CLAIRE PARKINSON 190
RESEARCH PROBLEMS
Can φ(n) Properly Divide n − 1? . . . . . . . . . . . . . . . RONALD ALTER 192
CLASSROOM NOTES
A Discovery Approach to e . . . . . . . . . . . . . . . . . . . . SPL TULL 193
Simple Proofs of Two Estimates for e . . . . . . . . . . . . . . R. B. DARST 194
MATHEMATICAL EDUCATION
The Lecture Method in Mathematics: A Student’s View . . . . . . M. W. HAM 195
(Continued on inside cover)
1973


ELEMENTARY PROBLEMS AND SOLUTIONS . . . . . . . . . . . . . . . . . . . . 202
ADVANCED PROBLEMS AND SOLUTIONS . . . . . . . . . . . . . . . . . . . . . 208
REVIEWS
NEWS AND NOTICES
MATHEMATICAL ASSOCIATION OF AMERICA
Calendars of Future Meetings . . . . . . . . . . . . . . . . . . . . . . 282


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p. 2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 
protection against loss. 

Backlog: Main Articles 12 months, Math. Notes 15 months, Research Problems 7 months, Classroom Notes
11 months, Math. Education 10 months.


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to HARLEY FLANDERS, American Mathe-
matical Monthly, Tel Aviv University, Ramat Aviv, Israel (see Notice, vol. 77, 1970, p. 555); NOTES, etc.:
to the corresponding Associate Editor; ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN,
Mathematical Association of America, SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS
and SUBSCRIPTIONS: to A. B. WILLCOX, Mathematical Association of America, 1225 Connecticut Ave.,
N.W., Washington, D.C. 20036.


HARLEY FLANDERS, Editor 
ASSOCIATE EDITORS 


JOSHUA BARLAZ J. G. HARVEY SEYMOUR SCHUSTER 
E.R. BERLEKAMP ERIC S. LANGFORD J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA P. D. LAX E. P. STARKE 

ROBERT GILMER ARTHUR MATTUCK LYNN A. STEEN 
RICHARD GUY M. W. POWNALL JAMES WENDEL 

RAOUL HAILPERN GIAN-CARLO ROTA 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


AWARD FOR DISTINGUISHED SERVICE TO 
PROFESSOR RAYMOND L. WILDER 


This year’s Award for Distinguished Service goes to a man with wide interests. He
is known for his contributions to mathematics and logic, his fondness for anthropo- 
logy and the history of the development of mathematical concepts, and his service 
through excellence in teaching and able leadership of mathematical organizations. 

Raymond L. Wilder was born in Palmer, Massachusetts in 1896. His versatility 
showed itself early. He was quite good at the piano, and the local movie house hired 
him to accompany their silent movies. 

While Ray liked mathematics, logic, history, and anthropology, he did not see a 
ready market for a knowledge of these subjects and decided to study actuarial mathe- 
matics. Brown and Texas were leading universities in this area at that time. Wilder 
pursued his study of actuarial mathematics through the bachelor’s and master’s degree 
at Brown University, and then went to the University of Texas to continue these 
studies. 

Wilder wanted a better acquaintance with pure mathematics, and asked to enroll
in one of Professor R. L. Moore’s classes in topology. Moore at first refused him ad-
mission, since Wilder’s interest in topology was only secondary. Moore remarked
that he doubted that Wilder would like the rigors of proving theorems, and per-
haps would not be much good at it even if he did. However, Wilder persisted that he
was really interested in pure mathematics and answered some of Moore’s questions
(such as, “What is an axiom?”) well enough that Moore relented and let him enroll.
For a while he was ignored by Moore, but as Wilder was able to prove some difficult 
theorems, Moore’s enthusiasm began to grow. Wilder continued his actuarial studies, 
but when Moore learned that Wilder had solved a problem that had baffled J. R. 
Kline and others, he suggested that Wilder write it up for a thesis promptly. The 
deadline for taking language and qualifying exams had already passed, but Moore 
cut through red tape and arranged for Wilder to take his exams after the deadline. 
Wilder took his Ph. D. in topology that very same year and gave up his actuarial 
studies. 

Wilder spent two years at Ohio State University after finishing at Texas. He then 
went to the University of Michigan and worked his way through the ranks to become 
their first Research Professor of Mathematics. Professor Wilder was very influential 
in helping build the University of Michigan into the giant center of excellence in 
mathematics that it is today. One year he was their Henry Russel Lecturer. This is the 
University of Michigan’s highest award, presented for outstanding scholarship and to 
a person who has reached full maturity in his scholarly work. 

Professor Wilder’s students report that he is a great teacher. Not that his lectures 
were always polished (which they were not) but, more important, they were stimulating 




and challenging. Word got around that Wilder not only taught mathematics, but 
about mathematics. His course in Foundations was widely discussed, even outside 
mathematical circles, and the course became one of the most popular on campus. 
The general student learned the history of mathematical concepts and that mathe- 
matics was an important part of our culture. Each found in the course something 
related to his own interests. The course appealed to a diversified audience. 

Professor Wilder taught all students in his classes, and not just the elite. He 
regarded an ordinary student reaching to the limits of his capacity as more exciting 
than an excellent student just coasting along. While he would not try to make a 
purse out of a sow’s ear or turn each student into a mathematics major, he challenged 
each student according to that student’s ability, and encouraged the student to 
develop his own talents. Some students who had not intended to major in mathematics 
changed their majors when they became impressed with the cultural importance of 
mathematics and were fired with the excitement of discovery. Wilder stimulated sup-
erior undergraduates to become mathematicians by admitting them to his advanced
courses and letting them talk, prove theorems, and learn by doing. The article on the
Distinguished Service Award to Ed Begle four years ago reported that Begle received 
his predoctoral training at the University of Michigan, where under the influence of 
Ray Wilder he became interested in topology. 

Professor Wilder is superb in leading others to do research. He directed the theses 
of twenty-five students, including Leon Cohen, Paul Swingle, Sam Kaplan, Morton 
Curtis, Alice Dickinson, Joe Shoenfield, Tom Brahana, Frank Raymond, and Kyung 
Kwun. He influenced the work of students who wrote their theses with others, includ- 
ing Norman Steenrod (whose first paper came from work he did in Wilder’s class) and 
Stephen Smale (whose first two papers were connected with Wilder’s seminars and 
questions raised by Wilder). Wilder’s seminars were lively affairs that had a stimulat- 
ing impact on research. They influenced not only the work of graduate students, but 
also the research of Wilder himself and that of his many colleagues who attended. 

One of the things that makes Professor Wilder so successful as a teacher is the high 
quality of his research. His earlier research dealt with properties of the plane and con- 
tinuous curves, but he broadened his interests to higher dimensions and along with his 
students developed the important notion of generalized manifolds. He wrote over 70
research papers and three books. His colloquium volume is a profound treatment of
the topology of generalized manifolds; his second book dealt with foundations; the
third is concerned with the evolution of mathematical concepts. The breadth of 
Wilder’s scholarship is evidenced by the fact that although most of his papers deal 
with geometric topology, he addressed the International Congress of Mathematicians 
on “The Cultural Basis of Mathematics,” his Gibbs lecture was entitled “Trends and
Social Implications of Research,” and one of his latest papers appears in a medical
journal. 

Another quality that makes Ray Wilder a great teacher is his interest in people. 
The humanitarian aspect of his character is one of his finest assets. He likes people and 




they like him. Students sought his counsel. He became to many a personal friend and
one whom they never forgot. His colleagues report that students from long ago have
dropped by to visit with Ray and enjoy the memories of his gentle introduction to
mathematics, culture, history, and humanity that they recall with so much relish. Ray
would remember them and usually prove it with an anecdote. Wilder’s students have 
a warm place in their hearts for him and his talented and charming wife, Una. 

A dramatic shift in the position of the United States in mathematical affairs occur- 
red just prior to and during World War II. Immigration of leading foreign mathe- 
maticians played a big part in this. Professor Wilder was instrumental in influencing 
Michigan and the federal establishment to find places for some of these talented 
immigrants. His motivation was both humanitarian and scientific. Reactions such as
“surely there must be an American just as well qualified” had to be overcome. Wilder’s
efforts in getting Sammie Eilenberg to join Michigan’s staff brought great mathe- 
matical rewards. Our country today is much stronger mathematically because of the 
efforts of Wilder and others during these trying times. 

Professor Wilder has devoted himself to working with mathematical organizations 
as well as to teaching and research. He has been an advisor to the Mathematics 
Division of the Air Force Office of Scientific Research, the NRC Fulbright Committee, 
the Michigan Mathematical Journal, and to many educational groups. He has been 
active on committees of both MAA and AMS, and has served as president of both 
organizations. He continues to work through these organizations for the improve- 
ment of teaching, the promotion of research, and the well-being of people. 

Professor Wilder contends that mathematics is one of the most important cultural 
components of every modern society. He feels that a knowledge of mathematics and 
its methods should be a part of the intellectual and cultural background of each well- 
trained person; whether he be a teacher, businessman, legislator, public servant, or 
housewife. People well trained in mathematics are often able to analyze in a unique 
fashion. Wilder believes that the need for mathematically trained people is increas- 
ing — but in areas not traditionally associated with mathematics — positions in 
government, industry, economics, and particularly in managerial positions. New 
courses and new teaching techniques will be needed to give these sorts of people the 
special kind of mathematical training that will appeal to them and be most useful to 
them. We need more Wilder-type teaching. 

R. H. BING 


AWARD OF THE 1973 CHAUVENET PRIZE TO 
PROFESSOR CARL DOUGLAS OLDS 


The Board of Governors of the Mathematical Association of America at its meet- 
ing on August 27, 1972, at Dartmouth College, voted to award the 1973 Chauvenet 
Prize to Professor Carl Douglas Olds for his paper ““The Simple Continued Fraction 
Expansion of e,’’ which appeared in this MONTHLY, 77 (1970) 968-974. 

A certificate and monetary award in the amount of five hundred dollars was pre- 
sented to Professor Olds at the time of the annual business meeting of the Associa- 
tion on January 28, 1973, in Dallas. 

The Chauvenet Prize is awarded for a noteworthy paper of an expository or sur- 
vey nature published in English, which comes within the range of profitable reading 
for members of the Association. The purpose of the prize is to stimulate the writing 
of expository and survey articles. The 1973 Prize, awarded for a paper published in 
the three-year period 1969-71, is the twenty-first award of the Chauvenet Prize since 
its institution by the MAA in 1925. For the list of the names of previous winners, 
see this MONTHLY, 71 (1964) p. 589, 73 (1965) pp. 2-3, 74 (1967) p. 3, 75 (1968) pp. 3-4,
77 (1970) pp. 117-118, 78 (1971) pp. 112-113, and 79 (1972) pp. 112-113.

Professor Olds was born on May 11, 1912, in Wanganui, New Zealand. He 
received all his degrees at Stanford University, the A. B. 1936, the A. M. 1937, 
and the Ph.D. in 1943 under Professor J. V. Uspensky. From 1935 to 1940 and in 
the summer of 1942, he was an acting instructor at Stanford University, and from 
1940 to 1945 an assistant professor at Purdue University. Since 1945, he has been at 
California State University, San Jose, advancing through the ranks to full professor. 

Professor Olds has served the mathematical community extensively. His service 
to the Mathematical Association of America included the acting chairmanship of the 
Northern California Section for part of 1951, Secretary-Treasurer of that Section 
from 1952 to 1955, and Sectional Governor for the period 1956-58. He served as 
first editor of the MATHEMATICAL LOG, the official publication of Mu Alpha Theta, 
the national high school and junior college mathematics club. He was awarded the 
Mu Alpha Theta service plaque in 1966. 

Professor Olds’ skill as a teacher was recognized by the award to him of a Calif- 
ornia State College Distinguished Teaching Award for the academic year 1965-66. 

Professor Olds’ substantial contributions to various parts of number theory are 
contained in his many publications in a great variety of periodicals. He is also the 
author of the book Continued Fractions, published as part of the New Mathematics
Library of Random House in 1963. 

In accepting the Award, Professor Olds indicated how very pleased and honored 
he felt. He added that in the past, and especially during the last few years, the Editors 
of the MONTHLY have done a fine job in encouraging expository writing. He thought
that more mathematicians would write such articles if they realized that expository 
articles do not have to be long, do not have to cover an entire field of study, and do 
not need to include every reference that exists on a subject. 




THE ELEMENTARY CASES OF LANDAU’S PROBLEM OF 
INEQUALITIES BETWEEN DERIVATIVES 


I. J. SCHOENBERG, University of Wisconsin 


INTRODUCTION 


In 1913 Landau initiated in [5] a new kind of extremum problem: The sharp 
inequalities between the supremum-norms of derivatives. He wrote two further 
papers, [6] and [7], on this subject (see also [3, 139-142]). Here we are only con- 
cerned with his first paper [5]. A lively activity on this subject culminated in 1939 
with Kolmogorov’s remarkable paper [4], where Landau’s R-problem was solved 
for all values of n (Landau had solved it for n = 2 only). In 1941 Bang [2] gave 
a second proof of Kolmogorov’s theorem using the theory of almost periodic func- 
tions. Recently, the author gave a third proof in [13]. This third proof is in essence 
an elaboration of Landau’s original direct approach and may be regarded as an 
application of spline theory. The analogue of Kolmogorov’s theorem for the halfline
$R_+$ has recently been established in [11].

The present paper discusses for both $R$ and $R_+$ those cases of Landau’s problem
that require no knowledge beyond the elements of the Differential and Integral
Calculus of functions of one variable. The novel contribution of this paper, besides
the proofs, is the discussion of the extremizing functions in Theorems 4, 5, 6, and 7,
for the $R$-problem, and Theorems 9 and 11 for the $R_+$-problem.

The author believes that the subject can be used to supplement the contents of 
a calculus course, of an introductory course in numerical analysis, or for lectures 
in undergraduate, or beginning graduate, seminars. In doing this there is a good 
deal of flexibility. The main objects of discussion are the Euler splines $\mathcal{E}_n(x)$, and the
essential section of Part I is §2. The §§1 and 3 only furnish further background and
may be omitted. If I were to make a selection, I would choose Theorems 1, 2, 4,
Corollaries 1, 2, and Theorem 5. This choice was implemented when in the


I. J. Schoenberg received his Doctor’s Degree at the Univ. of Jassy, Rumania. In his thesis he 
initiated the theory of non-uniform asymptotic distribution of sequences, mod 1. He held positions 
at the Univ. of Jassy, Univ. of Chicago (Rockefeller Fellow), the Institute for Advanced Study, 
Swarthmore Coll., Colby Coll., and the Univ. of Pennsylvania before going to his present Professor- 
ship at the Mathematics Research Center, Univ. of Wisconsin. He has spent leaves-of-absence at 
the Ballistic Research Laboratories Aberdeen, Institute for Numerical Analysis U.C. L. A., Stanford 
Univ., I. A. S., and the Technion, Haifa. 

His main research interests are Diophantine approximations, moment problems and related 
topics, distance geometry, total positivity, approximation theory and practice. He edited Approxi- 
mations with Special Emphasis on Spline Functions (Academic Press 1969), and is preparing a mono- 
graph on Cardinal Spline Interpolation. Editor. 




122 I. J. SCHOENBERG [February 


framework of the Visiting Lectureship Program of the MAA the author gave three 
one-hour lectures on this subject at Wichita State University on December 6 and 
7, 1971. He wishes to thank Professor Keith Moore, Albion College, and Professor 
William M. Perel, Wichita State University, for arranging these lectures. This 
experience encouraged the author to write this paper. 


I. THE EULER SPLINES


1. Cardinal spline interpolation. Let $n$ be a natural number and let $\mathcal{S}_n = \{S(x)\}$
be the class of functions $S(x)$ having the following two properties:

(i) $S(x) \in C^{n-1}(R)$.

(ii) The restriction of $S(x)$ to every interval $(\nu, \nu + 1)$ between consecutive
integers is a polynomial of degree $\le n$.

Such functions $S(x)$ are called cardinal spline functions of degree $n$. Evidently
$\pi_n \subset \mathcal{S}_n$, where $\pi_n$ denotes the class of polynomials of degree not exceeding $n$. We
may even consider $\mathcal{S}_0$, the class of step-functions with discontinuities at the integers.
Indefinite integration of the elements of $\mathcal{S}_0$ gives the elements of $\mathcal{S}_1$, also called
cardinal linear splines (the term “spline” can be used either as an adjective or as
a noun). Integrating the elements of $\mathcal{S}_1$ we obtain those of $\mathcal{S}_2$, also called cardinal
quadratic splines, and so forth. The term “cardinal” is to remind us that we pass from one
polynomial component of $S(x)$ to the next at the integers. These transition points
are called the knots of the spline.

It is also useful to introduce the class 


(1.1)    $\mathcal{S}_n^* = \{S(x) : S(x + \tfrac{1}{2}) \in \mathcal{S}_n\}$.


The elements of $\mathcal{S}_n^*$ are again defined by the properties (i) and (ii), provided that
we replace in (ii) the interval $(\nu, \nu + 1)$ by $(\nu - \tfrac{1}{2}, \nu + \tfrac{1}{2})$. The knots of $S(x)$ are now
half-way between the integers, and $S(x)$ may be called a midpoint spline.

With elements of the class $\mathcal{S}_n$, or perhaps $\mathcal{S}_n^*$, we may attempt to solve the
following


CARDINAL INTERPOLATION PROBLEM. Given the sequence of numbers 
(1.2)    $(y_\nu) = (\ldots, y_{-2}, y_{-1}, y_0, y_1, y_2, \ldots)$
we are to find $S(x)$ such that
(1.3)    $S(\nu) = y_\nu$ for all integers $\nu$.


We restrict our discussion to the case when $(y_\nu)$ is a bounded sequence. This means
that for an appropriate $K$


(1.4)    $|y_\nu| < K$ for all $\nu$.


A main result is the following 


1973] THE ELEMENTARY CASES OF LANDAU’S PROBLEM 123 


THEOREM OF CARDINAL SPLINE INTERPOLATION. We assume that (1.4) holds. 

1. If $n$ is odd, then there exists a unique $S(x) \in \mathcal{S}_n$ such that $S(x)$ is bounded
for all real $x$ and satisfies the interpolation conditions (1.3).

2. If $n$ is even, then there exists a unique $S(x) \in \mathcal{S}_n^*$ such that $S(x)$ is bounded
and satisfies (1.3).


The first part of this theorem was first established by Subbotin [14]. For the
complete theorem under more general conditions (the condition (1.4) is replaced by
the requirement that $y_\nu$ should grow at most like some power of $|\nu|$ as $\nu \to +\infty$
or $\nu \to -\infty$) see [10].

The theorem is trivial if $n = 1$, but is no longer so if $n > 1$. Indeed, a linear
spline $S_1(x)$ satisfying (1.3) is immediately obtained by successive linear interpolation
between consecutive ordinates $y_\nu$ and $y_{\nu+1}$. The condition (1.4) is not needed in this
case and $S_1(x)$ is evidently unique for any sequence $(y_\nu)$.
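The trivial case $n = 1$ can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `linear_cardinal_spline` and the choice of the alternating data $y_\nu = (-1)^\nu$ are assumptions made here for the example, not notation from the paper.

```python
import math

def linear_cardinal_spline(y, x):
    """Evaluate S_1(x), the linear spline with S_1(v) = y(v) at every integer v."""
    v = math.floor(x)      # knot immediately to the left of (or at) x
    t = x - v              # relative position inside [v, v + 1)
    return (1 - t) * y(v) + t * y(v + 1)

def alternating(v):
    """Illustrative data: y_v = (-1)^v."""
    return -1 if v % 2 else 1

# Sample the resulting triangular wave at a few points.
samples = [linear_cardinal_spline(alternating, x) for x in (0, 0.25, 0.5, 1, 1.5, 2)]
```

No boundedness assumption is used anywhere in the evaluation, in line with the remark that (1.4) is not needed when $n = 1$.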

Remarkable cardinal splines are obtained from the above theorem for particular 
simple sequences (1.2). Here are two examples. 


A. The fundamental splines. For the special sequence
(1.5)    $y_0 = 1$, $y_\nu = 0$ if $\nu \ne 0$,


the theorem furnishes a unique bounded solution that we denote by $L_n(x)$. Thus


(1.6)    $L_n(0) = 1$, $L_n(\nu) = 0$ if $\nu \ne 0$.
Of course

(1.7)    $L_n(x) \in \mathcal{S}_n$ if $n$ is odd, $L_n(x) \in \mathcal{S}_n^*$ if $n$ is even.


The following is also true: The unique bounded solution $S(x)$ of the interpolation
problem (1.3) may be represented by the formula


(1.8)    $S(x) = \sum_{\nu=-\infty}^{\infty} y_\nu L_n(x - \nu)$,
where the series converges uniformly in every finite interval. This is a cardinal spline
analogue of Lagrange’s interpolation formula (see [10]).


B. The Euler splines. Very likely the most interesting examples of cardinal 
spline functions arise if we apply the above theorem to the sequence 


(1.9)    $y_\nu = (-1)^\nu$ for all $\nu$.


For each $n$ we denote the solution by $\mathcal{E}_n(x)$ and call it the Euler spline of degree $n$.
Thus




(1.10)    $\mathcal{E}_n(\nu) = (-1)^\nu$ for all $\nu$, and $\mathcal{E}_n(x) \in \mathcal{S}_n$ if $n$ is odd, $\mathcal{E}_n(x) \in \mathcal{S}_n^*$ if $n$ is even.


These properties, together with the requirement that $\mathcal{E}_n(x)$ is bounded, define this
function uniquely on the basis of the cardinal interpolation theorem. We may also
apply (1.8) and define $\mathcal{E}_n(x)$ by


$\mathcal{E}_n(x) = \sum_{\nu=-\infty}^{\infty} (-1)^\nu L_n(x - \nu)$.


Our entire discussion so far was to show how the Euler splines fit into the theory
of cardinal spline interpolation. However, this approach to $\mathcal{E}_n(x)$ does not help us
much, because we have not established here the general interpolation theorem, nor
have we learnt anything concerning $L_n(x)$ beyond its existence and uniqueness.
Fortunately, there is a direct constructive approach to the Euler spline $\mathcal{E}_n(x)$ to
which we now proceed.


2. A direct construction of the Euler splines. Let f(x) be defined on R and 
integrable in every finite interval. 


DEFINITIONS. 1. We say that $f(x)$ is even about the point $x = a$, provided that
it satisfies $f(x) = f(2a - x)$ for all $x$. Likewise $f(x)$ is odd about $x = a$ if
$f(x) = -f(2a - x)$.

2. We say that $f(x)$ has the property $P_0$, or $f(x) \in P_0$, provided that $f(x)$ is even
about $x = 0$, and odd about $x = 1/2$.

3. We say that $f(x)$ has the property $P_1$, or $f(x) \in P_1$, provided that $f(x)$ is
odd about $x = 0$, and even about $x = 1/2$.


LEMMA 1. If $f(x) \in P_0$, or $f(x) \in P_1$, then $f(x)$ is a periodic function of period 2,
hence $f(x - 2) = f(x)$.


Proof. If $f(x) \in P_0$, then
$f(x) = -f(1 - x) = -f(x - 1) = f(2 - x) = f(x - 2)$.
If $f(x) \in P_1$, then
$f(x) = f(1 - x) = -f(x - 1) = -f(2 - x) = f(x - 2)$. □
We may omit the proof of the easily established


LEMMA 2. If $f(x)$ is even (odd) about $x = a$ then $\int_a^x f(t)\,dt$ is odd (even) about
$x = a$.

LEMMA 3. 1. If $f(x) \in P_0$ and $g_0(x) = \int_0^x f(t)\,dt$, then $g_0(x) \in P_1$.

2. If $f(x) \in P_1$ and $g_1(x) = \int_{1/2}^x f(t)\,dt$, then $g_1(x) \in P_0$.




Proof. 1. Let $f(x) \in P_0$. By Lemma 2 $g_0(x)$ is odd about $x = 0$. Let us show
that it is even about $x = 1/2$. By Lemma 2 applied with $a = 1/2$ we have

$g_0(x) = \int_0^x f(t)\,dt = \int_0^{1/2} f(t)\,dt + \int_{1/2}^x f(t)\,dt = \int_0^{1/2} f(t)\,dt + \int_{1/2}^{1-x} f(t)\,dt = \int_0^{1-x} f(t)\,dt = g_0(1 - x)$.

2. Let $f(x) \in P_1$. By Lemma 2 $g_1(x)$ is odd about $x = 1/2$. Let us show that it is
even about $x = 0$. Again, by Lemma 2

$g_1(x) = \int_{1/2}^x f(t)\,dt = \int_{1/2}^0 f(t)\,dt + \int_0^x f(t)\,dt = \int_{1/2}^0 f(t)\,dt + \int_0^{-x} f(t)\,dt = \int_{1/2}^{-x} f(t)\,dt = g_1(-x)$. □


FIG. 1. Graphs of $f_0(x)$ (the square wave) and of the functions derived from it.


We start with the function $f_0(x)$ defined by


(2.1)    $f_0(x) = (-1)^\nu$ if $\nu \le x < \nu + 1$,




whose graph is the “square-wave” of Figure 1. From it we derive the functions
(see Figure 1)


(2.2)    $f_1(x) = \int_{1/2}^x f_0(t)\,dt$, $f_2(x) = \int_0^x f_1(t)\,dt$, $f_3(x) = \int_{1/2}^x f_2(t)\,dt$,

and generally

(2.3)    $f_n(x) = \int_{\alpha_n}^x f_{n-1}(t)\,dt$,


where
(2.4)    $\alpha_n = 0$ if $n$ is even, $\alpha_n = 1/2$ if $n$ is odd.


LEMMA 4. We have that


(2.5)    $f_n(x) \in \mathcal{S}_n$ $(n = 0, 1, 2, \ldots)$,
and

(2.6)    $f_n(x) \in P_0$ if $n$ is odd, $f_n(x) \in P_1$ if $n$ is even.


Proof: (2.5) is clear from (2.3) and an earlier remark that an integral of a spline
is again a spline of a degree by one unit higher.

Also (2.6) follows from (2.3) and Lemma 3. Since $f_0(x) \in P_1$, we conclude that
$f_1(x) \in P_0$ and therefore $f_2(x) \in P_1$, and so forth. □


LEMMA 5. 1. In $[0,1]$ the functions $f_{2k}(x)$ are alternately strictly convex or
concave and vanish only at $x = 0$ and $x = 1$.
2. In $[0,1]$ the functions $f_{2k-1}(x)$ are alternately strictly increasing or
decreasing and vanish at $x = 1/2$ only.
In particular


(2.7)    $(-1)^k f_{2k-1}(0) > 0$, $(-1)^k f_{2k}(\tfrac{1}{2}) > 0$.


Proof: That $f_{2k}(x)$ vanishes at 0 follows from (2.3), (2.4), and its vanishing at 1
follows from (2.6), it being even about $x = 1/2$. (2.6) also implies that $f_{2k-1}(1/2) = 0$.
The remaining statements follow from (2.3) by induction in $n$: $f_1(x)$ is strictly
increasing, therefore $f_2(x)$ is strictly convex and therefore $f_3(x)$ is strictly decreasing.
This implies the strict concavity of $f_4(x)$, and so forth. □


LEMMA 6. The functions defined by
(2.8)    $\mathcal{E}_{2k-1}(x) = f_{2k-1}(x)/f_{2k-1}(0)$




and
(2.9)    $\mathcal{E}_{2k}(x) = f_{2k}(x + \tfrac{1}{2})/f_{2k}(\tfrac{1}{2})$
are identical with the Euler splines as defined in §1B (see Figure 1).


Proof: Indeed, it should be clear that the newly defined functions enjoy the
properties (1.10) and that they are bounded, since $|\mathcal{E}_n(x)| \le 1$ for all $x$. The unicity
of the functions having these properties establishes the identity with the old definition.
In any case for us (2.8) and (2.9) are the working definition of the Euler splines. □
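The construction (2.1)-(2.4) followed by the normalizations (2.8), (2.9) is entirely mechanical, and a short Python sketch in exact rational arithmetic may help to see it at work. The helper names (`integrate_from`, `f_on_unit_interval`, `euler_spline_piece`) and the coefficient-list representation of polynomials are assumptions of this illustration, not notation from the paper. Only one polynomial piece of each spline is produced: the piece on $[0,1]$ for odd $n$ and the piece on $[-\tfrac12, \tfrac12]$ for even $n$.

```python
from fractions import Fraction
from math import comb

def evaluate(coeffs, x):
    """Value of c0 + c1*x + c2*x^2 + ... at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def integrate_from(coeffs, lower):
    """Coefficients of the integral of the polynomial from `lower` to x."""
    anti = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    anti[0] = -evaluate(anti, lower)      # make the integral vanish at x = lower
    return anti

def f_on_unit_interval(n):
    """Polynomial representing f_n on [0, 1], following (2.1)-(2.4)."""
    poly = [Fraction(1)]                  # f_0 = 1 on [0, 1)
    for m in range(1, n + 1):
        lower = Fraction(0) if m % 2 == 0 else Fraction(1, 2)   # alpha_m of (2.4)
        poly = integrate_from(poly, lower)
    return poly

def shift(coeffs, h):
    """Coefficients of p(x + h), by binomial expansion."""
    shifted = [Fraction(0)] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            shifted[j] += c * comb(k, j) * h ** (k - j)
    return shifted

def euler_spline_piece(n):
    """One polynomial piece of the Euler spline of degree n, via (2.8) or (2.9)."""
    f = f_on_unit_interval(n)
    if n % 2 == 1:                        # (2.8): piece valid on [0, 1]
        scale = evaluate(f, Fraction(0))
        return [c / scale for c in f]
    half = Fraction(1, 2)                 # (2.9): piece valid on [-1/2, 1/2]
    scale = evaluate(f, half)
    return [c / scale for c in shift(f, half)]
```

For instance, `euler_spline_piece(3)` returns the coefficient list of $1 - 6x^2 + 4x^3$, lowest degree first.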

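If $f(x)$ is a bounded function defined on $R$, we define its norm by


(2.10)    $\|f\| = \sup_x |f(x)|$.


We shall be particularly concerned with the norm of $\mathcal{E}_n(x)$ and of its derivatives
and write


(2.11)    $\|\mathcal{E}_n^{(\nu)}\| = \gamma_{n,\nu}$, $(\nu = 0, 1, \ldots, n)$.
LEMMA 7.
(2.12)    $\|\mathcal{E}_n^{(\nu)}\| = |\mathcal{E}_n^{(\nu)}(0)|$ if $\nu$ is even, $\|\mathcal{E}_n^{(\nu)}\| = |\mathcal{E}_n^{(\nu)}(\tfrac{1}{2})|$ if $\nu$ is odd.


Proof: (2.3) implies that
(2.13)    $f_n^{(\nu)}(x) = f_{n-\nu}(x)$ $(\nu = 0, 1, \ldots, n)$.
Moreover, we easily show that
(2.14)    $\|f_n\| = |f_n(\tfrac{1}{2})|$ if $n$ is even, $\|f_n\| = |f_n(0)|$ if $n$ is odd.
Let $n = 2k$, and let $c = 1/f_{2k}(\tfrac{1}{2})$. By (2.9) and (2.13) we find
$\mathcal{E}_{2k}^{(\nu)}(x) = c f_{2k}^{(\nu)}(x + \tfrac{1}{2}) = c f_{2k-\nu}(x + \tfrac{1}{2})$.


By (2.14) this is seen to reach its largest absolute value at $x = 0$ if $\nu$ is even, and at
$x = 1/2$ if $\nu$ is odd. Similarly, using (2.8), we establish (2.12) if $n$ is odd. □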


3. The connection with the Euler polynomials. Let us denote by $P_n(x)$ the
polynomial of degree $n$ that represents the spline function $f_n(x)$ in the interval $[0, 1]$.
Thus


(3.1)    $f_n(x) = P_n(x)$ if $0 \le x < 1$, $P_n(x) \in \pi_n$.
Thus, from Figure 1 we find


$P_0(x) = 1$, $P_1(x) = x - \tfrac{1}{2}$, $P_2(x) = \tfrac{x^2}{2} - \tfrac{x}{2}$, and so forth.
2 




Clearly (2.3), (2.4) imply that


(3.2)    $P_n(x) = \int_{\alpha_n}^x P_{n-1}(t)\,dt$, $\alpha_n = 0$ if $n$ is even, $\alpha_n = 1/2$ if $n$ is odd,


and therefore
(3.3)    $P_n'(x) = P_{n-1}(x)$.


A sequence of polynomials, like our $P_n(x)$, that is obtained by starting from $P_0(x) = 1$
and integrating successively, is called an Appell sequence. Integrating successively
we obtain


$P_1(x) = x + a_1$, $P_2(x) = \frac{x^2}{2!} + \frac{a_1}{1!}\frac{x}{1!} + \frac{a_2}{2!}$,


the $n$th polynomial being


(3.4)    $P_n(x) = \frac{x^n}{n!} + \frac{a_1}{1!}\frac{x^{n-1}}{(n-1)!} + \frac{a_2}{2!}\frac{x^{n-2}}{(n-2)!} + \cdots + \frac{a_{n-1}}{(n-1)!}\frac{x}{1!} + \frac{a_n}{n!}$.


Here $a_1/1!$, $a_2/2!$, $\ldots$ are the successive constants of integration.

Appell has observed that the infinite string of relations (3.4) can be described
by a single relation involving series of powers of $z$. Indeed, multiplying the power
series


(3.5)    $g(z) = \sum_{n=0}^{\infty} \frac{a_n}{n!} z^n$ $(a_0 = 1)$ and $e^{xz} = \sum_{n=0}^{\infty} \frac{x^n}{n!} z^n$
and using (3.4) we find that
(3.6)    $g(z)e^{xz} = \sum_{n=0}^{\infty} P_n(x) z^n$.


The left side is called the generating function of the polynomials $P_n(x)$.
Let us determine $g(z)$ for the particular sequence $P_n(x)$ defined by (3.1). By (3.2)
we know that


$P_{2k}(0) = 0$, $P_{2k-1}(\tfrac{1}{2}) = 0$ $(k = 1, 2, 3, \ldots)$.


Substituting into (3.6) the two values $x = 0$ and $x = \tfrac{1}{2}$, we conclude that $g(z) - 1$
is an odd function of $z$, and that $g(z)e^{z/2}$ is an even function of $z$. We therefore have
the identities


$g(z) - 1 = -g(-z) + 1$ and $g(z)e^{z/2} = g(-z)e^{-z/2}$.


Eliminating between them $g(-z)$ we obtain that


(3.7)    $g(z) = \frac{2}{e^z + 1}$.




If we write
(3.8)    $E_n(x) = n!\,P_n(x)$,
then (3.6) becomes


(3.9)    $\frac{2e^{xz}}{e^z + 1} = \sum_{n=0}^{\infty} \frac{E_n(x)}{n!} z^n$.


This expansion shows that the $E_n(x)$ are the classical Euler polynomials. (See [9],
[1, Chapter 23] also for further references.) Combining (3.1) and (3.8) we obtain


(3.10)    $f_n(x) = E_n(x)/n!$ in $0 \le x \le 1$,

and therefore, by (2.8) and (2.9), that

(3.11)    $\mathcal{E}_{2k-1}(x) = E_{2k-1}(x)/E_{2k-1}(0)$ in $0 \le x \le 1$,
(3.12)    $\mathcal{E}_{2k}(x) = E_{2k}(x + \tfrac{1}{2})/E_{2k}(\tfrac{1}{2})$ in $-\tfrac{1}{2} \le x \le \tfrac{1}{2}$.


The author could trace the spline function $n!\,f_n(x)$ to Nörlund’s book [9, §16] where
it is denoted by $\bar{E}_n(x)$, and where there are references to much earlier work by Hermite
and Sonin (1896).
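The passage from the generating function (3.7) back to the polynomials $P_n(x)$ can be checked by machine. The following Python sketch expands $g(z) = 2/(e^z + 1)$ as a power series with exact rational coefficients, reads off $P_n(x)$ from (3.4), and leaves it to the reader (or to a test) to confirm the defining conditions $P_{2k}(0) = 0$, $P_{2k-1}(\tfrac12) = 0$. The function names and the truncation order are assumptions of this illustration.

```python
from fractions import Fraction
from math import factorial

def g_coefficients(count):
    """First `count` Taylor coefficients c_n = a_n / n! of g(z) = 2 / (e^z + 1)."""
    # (e^z + 1) / 2 = 1 + sum_{k >= 1} z^k / (2 k!); invert this series term by term.
    d = [Fraction(1)] + [Fraction(1, 2 * factorial(k)) for k in range(1, count)]
    c = [Fraction(1)]
    for n in range(1, count):
        c.append(-sum(d[k] * c[n - k] for k in range(1, n + 1)))
    return c

def appell_polynomial(n, c):
    """Coefficients of P_n(x), lowest degree first, read off from (3.4)."""
    return [c[n - k] / factorial(k) for k in range(n + 1)]

def evaluate(coeffs, x):
    """Value of c0 + c1*x + c2*x^2 + ... at x."""
    return sum(p * x**k for k, p in enumerate(coeffs))

c = g_coefficients(8)
p3 = appell_polynomial(3, c)                          # x^3/6 - x^2/4 + 1/24
euler_e3 = [factorial(3) * p for p in p3]             # E_3(x) = 3! P_3(x), cf. (3.8)
```

The list `euler_e3` holds the coefficients of $x^3 - \tfrac32 x^2 + \tfrac14$, which is the classical Euler polynomial $E_3(x)$.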

In concluding this section we mention the relations


(3.13)    $\lim_{n \to \infty} L_n(x) = \frac{\sin \pi x}{\pi x}$
and
(3.14)    $\lim_{n \to \infty} \mathcal{E}_n(x) = \cos \pi x$,


both of which hold uniformly for all real $x$. Concerning (3.13) see [12]. The relation
(3.14) follows, via (2.8) and (2.9), from the beautiful Fourier series expansion of
$f_n(x)$.


II. LANDAU’S PROBLEM FOR $R = (-\infty, \infty)$. KOLMOGOROV’S THEOREM


4. Statement of Kolmogorov’s theorem. Let $n \ge 2$. We consider here the
class of functions $f(x)$ from $R$ to $R$ that are bounded and have a bounded $n$th derivative
$f^{(n)}(x)$. This last condition needs some further explanations as follows:
In the first place we assume that


(4.1)    $f(x) \in C^{n-1}(R)$
and that
(4.2)    $f^{(n-1)}(x)$ is piecewise continuously differentiable.


We interpret (4.2) to mean that the graph of $f^{(n-1)}(x)$ has a continuously turning




tangent, except for corners with finite slopes for their right and left tangents, and
that every finite interval contains at most a finite number of such corners. Finally,
of course, $f^{(n)}(x)$ is to be bounded for all real $x$.

Evidently, the Euler spline $\mathcal{E}_n(x)$ satisfies all these conditions. In fact we have
already considered the norms (2.11) of its derivatives and Lemma 7 shows how to
identify, by (2.12), the values of


(4.3)    $\gamma_{n,\nu} = \|\mathcal{E}_n^{(\nu)}\|$, $(\nu = 0, 1, \ldots, n)$, $\gamma_{n,0} = 1$.
THEOREM OF KOLMOGOROV. If $f(x)$ is such that

(4.4)    $\|f\| \le 1$, $\|f^{(n)}\| \le \gamma_{n,n}$,

then

(4.5)    $\|f^{(\nu)}\| \le \gamma_{n,\nu}$ for $\nu = 1, 2, \ldots, n - 1$.


The constants $\gamma_{n,\nu}$ in (4.5) are best constants because the Euler spline $\mathcal{E}_n(x)$
satisfies (4.4) and furnishes the equality sign in (4.5), simultaneously for all values
of $\nu$. Complete proofs of this theorem, in the order of their appearance, are found
in [4], [2], and [13]. As the title of this paper indicates we shall establish here only
the cases $n = 2$, $n = 3$, and will indicate the general method of attack used in [13]
by remarks concerning the problem for $n = 4$, $\nu = 1$, in §10.

In order to formulate the special cases that are to be established, we need the
numerical values of the corresponding $\gamma_{n,\nu}$. From (2.8), (2.9), or by determining
$f_2(x)$, $f_3(x)$, $f_4(x)$ directly by successive integrations from (2.3), we obtain


(4.6)    $\mathcal{E}_2(x) = 1 - 4x^2$ in $[-\tfrac{1}{2}, \tfrac{1}{2}]$,
(4.7)    $\mathcal{E}_3(x) = 1 - 6x^2 + 4x^3$ in $[0, 1]$,
(4.8)    $\mathcal{E}_4(x) = 1 - \tfrac{24}{5}x^2 + \tfrac{16}{5}x^4$ in $[-\tfrac{1}{2}, \tfrac{1}{2}]$.
Using (2.11) and (2.12), we find that
(4.9)    $\gamma_{2,0} = 1$, $\gamma_{2,1} = 4$, $\gamma_{2,2} = 8$,
(4.10)    $\gamma_{3,0} = 1$, $\gamma_{3,1} = 3$, $\gamma_{3,2} = 12$, $\gamma_{3,3} = 24$,
(4.11)    $\gamma_{4,0} = 1$, $\gamma_{4,1} = \tfrac{16}{5}$, $\gamma_{4,2} = \tfrac{48}{5}$, $\gamma_{4,3} = \tfrac{192}{5}$, $\gamma_{4,4} = \tfrac{384}{5}$.
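The constants (4.9)-(4.11) can be recomputed in a few lines. By (2.8), (2.9) and (2.13), the $\nu$th derivative of $\mathcal{E}_n$ is $f_{n-\nu}$ (possibly shifted) divided by the normalizing value of $f_n$, so that (2.14) gives $\gamma_{n,\nu} = \|f_{n-\nu}\| / \|f_n\|$. The Python sketch below uses this observation with exact fractions; the function names are assumptions of the illustration.

```python
from fractions import Fraction

def f_polys(n_max):
    """Coefficient lists (lowest degree first) of f_0, ..., f_{n_max} on [0, 1]."""
    polys = [[Fraction(1)]]
    for m in range(1, n_max + 1):
        anti = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(polys[-1])]
        lower = Fraction(0) if m % 2 == 0 else Fraction(1, 2)    # alpha_m from (2.4)
        anti[0] = -sum(c * lower**k for k, c in enumerate(anti))
        polys.append(anti)
    return polys

def sup_norm(m, polys):
    """||f_m|| via (2.14): |f_m(1/2)| for even m, |f_m(0)| for odd m."""
    point = Fraction(1, 2) if m % 2 == 0 else Fraction(0)
    return abs(sum(c * point**k for k, c in enumerate(polys[m])))

def gammas(n):
    """The list [gamma_{n,0}, ..., gamma_{n,n}]."""
    polys = f_polys(n)
    return [sup_norm(n - nu, polys) / sup_norm(n, polys) for nu in range(n + 1)]
```

Calling `gammas(2)`, `gammas(3)` and `gammas(4)` reproduces the three rows (4.9), (4.10) and (4.11).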


The first three cases of Kolmogorov’s theorem may now be spelled out as follows. 
THEOREM 1 (Landau). If $f(x)$ is such that
(4.12)    $\|f\| \le 1$, $\|f''\| \le 8$,




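then
(4.13)    $\|f'\| \le 4$.

THEOREM 2 (G. E. Silov). If $f(x)$ is such that
(4.14)    $\|f\| \le 1$, $\|f'''\| \le 24$,
then
(4.15)    $\|f'\| \le 3$, $\|f''\| \le 12$.


THEOREM 3 (G. E. Silov). If $f(x)$ is such that


(4.16)    $\|f\| \le 1$, $\|f^{(4)}\| \le \tfrac{384}{5}$,
then
(4.17)    $\|f'\| \le \tfrac{16}{5}$, $\|f''\| \le \tfrac{48}{5}$, $\|f'''\| \le \tfrac{192}{5}$.


For a reference to Silov’s work see [4].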


5. A kinematic interpretation: 1. It seems suggestive to think of $x$ as time and
of $f = f(x)$ as describing the motion of a point on the $f$-axis. The first inequality
(4.12) means that the point $f$ is forever moving on the segment $-1 \le f \le 1$. The
second inequality (4.12) requires that the acceleration in absolute value should
never exceed 8 cm/sec$^2$. The conclusion (4.13) states that the velocity will never
exceed 4 cm/sec. We know that this value is reached for the motion $f = \mathcal{E}_2(x)$
which is periodic of period 2 sec (Figure 1). Likewise (4.14) means that the rate of
change of the acceleration in absolute value is not to exceed 24 cm/sec$^3$. The
conclusions concerning the velocity and acceleration are then described by the
inequalities (4.15).


2. Let us consider the simple harmonic motion
(5.1)    $f = \sin \omega x$, ($\omega$ positive constant).


By differentiation we find that
(5.2)    $\|f\| = 1$, $\|f'\| = \omega$, $\|f''\| = \omega^2$, $\|f'''\| = \omega^3$.


We enforce (4.12) in the most advantageous way by choosing $\omega$ such that $\omega^2 = 8$,
hence $\omega = 2\sqrt{2} \approx 2.83$. Thus $\|f''\| = 8$, while $\|f'\| = \omega \approx 2.83$ is short of the
optimal value 4 given by (4.13).

Assuming (4.14) and choosing $\omega^3 = 24$, hence $\omega = 2\sqrt[3]{3} \approx 2.88$, we find from
(5.2) that $\|f'\| = \omega \approx 2.88$, $\|f''\| = \omega^2 \approx 8.32$, which are short of the optimal
values 3 and 12, respectively, as given by (4.15).
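The arithmetic of this comparison is easy to reproduce. The short Python fragment below, with variable names chosen only for this illustration, evaluates the frequencies that saturate the constraint on the highest derivative and the resulting norms of the lower derivatives of $\sin \omega x$.

```python
# Frequency saturating ||f''|| = 8 in (4.12), and the resulting velocity bound.
omega_for_thm1 = 8 ** (1 / 2)
velocity_thm1 = omega_for_thm1                  # ||f'|| = omega, optimal value is 4

# Frequency saturating ||f'''|| = 24 in (4.14), and the two lower derivatives.
omega_for_thm2 = 24 ** (1 / 3)
velocity_thm2 = omega_for_thm2                  # ||f'|| = omega, optimal value is 3
acceleration_thm2 = omega_for_thm2 ** 2         # ||f''|| = omega^2, optimal value is 12
```

In both cases the sine wave falls visibly short of the Euler spline, most markedly in the second derivative for $n = 3$.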


6. A general formulation of Kolmogorov’s theorem. Let F(x) be a bounded function 


having a bounded $n$th derivative and let

(6.1)    $\|F\| = M_0$, $\|F^{(n)}\| = M_n$.

What upper bound can we find for
(6.2)    $\|F^{(\nu)}\| = M_\nu$, $(0 < \nu < n)$?


The best bound for $M_\nu$ is easily found as follows: Let $a$ and $b$ be positive constants
and let


(6.3)    $f(x) = aF(bx)$.
We shall now determine $a$ and $b$ such that $f(x)$ satisfies the conditions
(6.4)    $\|f\| = 1$, $\|f^{(n)}\| = \gamma_{n,n}$.


Differentiating (6.3) and using (6.1) and (6.2), we find that
(6.5)    $\|f\| = aM_0$, $\|f^{(\nu)}\| = ab^{\nu}M_\nu$, $\|f^{(n)}\| = ab^{n}M_n$.


To insure (6.4) we determine $a$ and $b$ from the equations $aM_0 = 1$ and $ab^{n}M_n = \gamma_{n,n}$,
and find the values
(6.6)    $a = M_0^{-1}$, $b = \gamma_{n,n}^{1/n} M_0^{1/n} M_n^{-1/n}$.


For these values
(6.7)    $\|f^{(\nu)}\| = ab^{\nu}M_\nu = M_0^{-1} \gamma_{n,n}^{\nu/n} M_0^{\nu/n} M_n^{-\nu/n} M_\nu$.


The relations (6.4) show that $f(x)$ satisfies the assumptions (4.4) of Kolmogorov’s
theorem. We may therefore apply its conclusion to the effect that $\|f^{(\nu)}\| \le \gamma_{n,\nu}$.
Using (6.7) we find that the following statement holds.


KOLMOGOROV’S GENERAL THEOREM. The suprema (6.1) and (6.2) satisfy the
inequality


(6.8)    $M_\nu \le C_{n,\nu} M_0^{1-\nu/n} M_n^{\nu/n}$, where $C_{n,\nu} = \gamma_{n,\nu}\gamma_{n,n}^{-\nu/n}$ $(0 < \nu < n)$.


Notice that the factor $C_{n,\nu}$ is a numerical constant depending on $n$ and $\nu$, and that
it is the best constant because we obtain equality in (6.8) for the function $F(x)$ obtained
from (6.3) if we set there $f(x) = \mathcal{E}_n(x)$. This function is


$F(x) = a^{-1}\mathcal{E}_n(b^{-1}x)$,


where $a$ and $b$ have the values (6.6).
Using the values (4.9) and (4.10), the inequalities (6.8) become


for $n = 2$: $M_1 \le 2^{1/2} M_0^{1/2} M_2^{1/2}$
and
for $n = 3$: $M_1 \le (2^{-1} 3^{2/3}) M_0^{2/3} M_3^{1/3}$, $M_2 \le 3^{1/3} M_0^{1/3} M_3^{2/3}$.



7. A few approximate differentiation formulae. Our immediate objective is to 
establish Theorems 1 and 2. For this purpose we assemble here a few simple tools. 


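LEMMA 8. The following identities hold for functions $f(x)$ having appropriate
derivatives which are integrable:


(A)    $f'(\tfrac{1}{2}) = f(1) - f(0) + \int_0^1 K_1(x) f''(x)\,dx$,
where

(A′)    $K_1(x) = x$ if $0 \le x \le \tfrac{1}{2}$, $K_1(x) = x - 1$ if $\tfrac{1}{2} < x \le 1$.

(B)    $f'(\tfrac{1}{2}) = f(1) - f(0) + \int_0^1 K_2(x) f'''(x)\,dx$,
where

(B′)    $K_2(x) = -\tfrac{1}{2}x^2$ if $0 \le x \le \tfrac{1}{2}$, $K_2(x) = -\tfrac{1}{2}(x - 1)^2$ if $\tfrac{1}{2} < x \le 1$.

(C)    $f''(0) = f(1) - 2f(0) + f(-1) + \int_{-1}^1 K_3(x) f'''(x)\,dx$,
where

(C′)    $K_3(x) = \tfrac{1}{2}(x + 1)^2$ if $-1 \le x < 0$, $K_3(x) = -\tfrac{1}{2}(x - 1)^2$ if $0 < x \le 1$.


Proof: These formulae belong to those elementary parts of numerical analysis
which deal with the approximate performance of the operations of Calculus (interpolation,
differentiation, integration, and so forth). The fundamental tool in this field is
Taylor’s formula with Cauchy’s integral remainder

(7.1)    $f(t) = f(a) + \frac{t - a}{1!} f'(a) + \cdots + \frac{(t - a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + \frac{1}{(n-1)!} \int_a^t (t - x)^{n-1} f^{(n)}(x)\,dx$.


It is derived by integrating by parts the remainder n times.
A. We apply (7.1) for $n = 2$, $a = 1/2$ and the two values $t = 1$ and $t = 0$,
obtaining

$f(1) = f(\tfrac{1}{2}) + \tfrac{1}{2} f'(\tfrac{1}{2}) + \int_{1/2}^1 (1 - x) f''(x)\,dx$,

$f(0) = f(\tfrac{1}{2}) - \tfrac{1}{2} f'(\tfrac{1}{2}) + \int_0^{1/2} x f''(x)\,dx$.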




Subtracting we get

$f(1) - f(0) = f'(\tfrac{1}{2}) - \int_0^{1/2} x f''(x)\,dx - \int_{1/2}^1 (x - 1) f''(x)\,dx$

and this agrees with (A), (A′).
B. Observe that $K_2(x)$ is continuous and that
$K_2'(x) = -K_1(x)$.


We may therefore integrate by parts the remainder of (A) to obtain


$\int_0^1 K_1(x) f''(x)\,dx = -\int_0^1 f''(x)\,dK_2(x) = \int_0^1 K_2(x) f'''(x)\,dx$,


because $K_2(0) = K_2(1) = 0$. This establishes (B) and (B′). Alternatively, we apply
(7.1) for $n = 3$, $a = 1/2$ and the two values $t = 1$ and $t = 0$, and subtract the resulting
relations.

C. Apply (7.1) for $n = 3$, $a = 0$ and the two values $t = 1$ and $t = -1$ to obtain


$f(1) = f(0) + f'(0) + \tfrac{1}{2} f''(0) + \tfrac{1}{2} \int_0^1 (1 - x)^2 f'''(x)\,dx$,


$f(-1) = f(0) - f'(0) + \tfrac{1}{2} f''(0) + \tfrac{1}{2} \int_0^{-1} (-1 - x)^2 f'''(x)\,dx$.


Adding these we get

$f(1) - 2f(0) + f(-1) = f''(0) - \tfrac{1}{2} \int_{-1}^0 (x + 1)^2 f'''(x)\,dx + \tfrac{1}{2} \int_0^1 (x - 1)^2 f'''(x)\,dx$,


which is identical with (C) and (C′).
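Since (A), (B) and (C) are exact identities, they can be tested numerically on any smooth function. The Python sketch below does this for the arbitrarily chosen test function $f(x) = \sin(x + 0.3)$, integrating each polynomial piece of the kernel separately with a composite Simpson rule. The test function, the helper `simpson` and the number of panels are assumptions of this illustration.

```python
import math

def simpson(func, a, b, panels=200):
    """Composite Simpson rule on [a, b]; `panels` must be even."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# Test function and its first three derivatives.
f = lambda x: math.sin(x + 0.3)
d1 = lambda x: math.cos(x + 0.3)
d2 = lambda x: -math.sin(x + 0.3)
d3 = lambda x: -math.cos(x + 0.3)

# (A): K_1(x) = x on [0, 1/2] and x - 1 on [1/2, 1].
rhs_A = (f(1) - f(0)
         + simpson(lambda x: x * d2(x), 0, 0.5)
         + simpson(lambda x: (x - 1) * d2(x), 0.5, 1))

# (B): K_2(x) = -x^2/2 on [0, 1/2] and -(x - 1)^2/2 on [1/2, 1].
rhs_B = (f(1) - f(0)
         + simpson(lambda x: -0.5 * x**2 * d3(x), 0, 0.5)
         + simpson(lambda x: -0.5 * (x - 1)**2 * d3(x), 0.5, 1))

# (C): K_3(x) = (x + 1)^2/2 on [-1, 0] and -(x - 1)^2/2 on [0, 1].
rhs_C = (f(1) - 2 * f(0) + f(-1)
         + simpson(lambda x: 0.5 * (x + 1)**2 * d3(x), -1, 0)
         + simpson(lambda x: -0.5 * (x - 1)**2 * d3(x), 0, 1))
```

Up to quadrature error, `rhs_A` and `rhs_B` should both agree with $f'(\tfrac12)$ and `rhs_C` with $f''(0)$.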


8. Proofs of Theorems 1 and 2 and their extremizing functions in the strict sense.
Let us establish Theorem 1 (§4): We consider the function


(8.1)    $f_0(x) = -\mathcal{E}_2(x)$.
From (4.6) and Figure 1 we see that it has the properties

(8.2)    $f_0(0) = -1$, $f_0(1) = 1$, $f_0'(\tfrac{1}{2}) = 4$ and $f_0''(x) = 8$ in $(0, \tfrac{1}{2})$, $f_0''(x) = -8$ in $(\tfrac{1}{2}, 1)$.


Applying the differentiation formula (A) of §7 to $f_0(x)$ we find by (8.2) and the
explicit form (A′) of the kernel $K_1(x)$ that


(8.3)    $4 = f_0'(\tfrac{1}{2}) = 1 + 1 + 8 \int_0^1 |K_1(x)|\,dx$.


Let $f(x)$ be any function satisfying (4.12) and let us evaluate $f'(\tfrac{1}{2})$ by the formula
(A). Moreover, we may assume that $f'(\tfrac{1}{2}) \ge 0$, for if $f'(\tfrac{1}{2}) < 0$ then we could
replace $f(x)$ by $-f(x)$. We now obtain


(8.4)    $(0 \le)\ f'(\tfrac{1}{2}) = f(1) - f(0) + \int_0^1 K_1(x) f''(x)\,dx \le 1 + 1 + 8 \int_0^1 |K_1(x)|\,dx$,


the last inequality being a consequence of (4.12). Moreover, the last member is
equal to 4 by (8.3). Therefore


(8.5)    $|f'(\tfrac{1}{2})| \le 4$.


This implies that $|f'(x_0)| \le 4$ no matter what $x_0$ may be. For also $f(x + x_0 - \tfrac{1}{2})$
satisfies all assumptions and applying (8.5) to it, we find that $|f'(x_0)| \le 4$. □
Let us assume now that


(8.6)    $f'(\tfrac{1}{2}) = 4$


and see what the consequences are. Evidently (8.6) holds if and only if we have the
equality sign in (8.4). Also, again in view of the conditions (4.12), we have equality
in (8.4) if and only if $f(x)$ satisfies the conditions


(8.7)    $f(0) = -1$, $f(1) = 1$, $f''(x) = 8$ in $(0, \tfrac{1}{2})$, $f''(x) = -8$ in $(\tfrac{1}{2}, 1)$.
Moreover, $f(0) = -1$ and $\|f\| \le 1$ imply that $f'(0) = 0$, and $f(1) = 1$, with
$\|f\| \le 1$, imply that $f'(1) = 0$. It clearly follows from (8.6) that
$f(x) = -\mathcal{E}_2(x)$ in $(0,1)$.


We state this result as


THEOREM 4. If


(8.8)    $\|f\| \le 1$, $\|f''\| \le 8$

and

(8.9)    $f'(\tfrac{1}{2}) = 4$,

then

(8.10)    $f(x) = -\mathcal{E}_2(x)$ in the interval $[0,1]$.


Outside the interval $[0,1]$ there is little that we can say about the function $f(x)$
satisfying (8.8) and (8.9). Indeed, notice that there are many ways in which the
function (8.10) can be extended to all reals while still satisfying (8.8) (of course with
the equality sign in both inequalities). For beside the obvious extension


(8.11)    $f(x) = -\mathcal{E}_2(x)$ for all real $x$,


we can also write
(8.12)    $f(x) = 1$ if $x > 1$, $f(x) = -\mathcal{E}_2(x)$ if $0 \le x \le 1$, $f(x) = -1$ if $x < 0$,
and many similar modifications of the function (8.11).
A comment on the function $f(x)$ satisfying (8.8) and (8.9) is in order. We can
call $f(x)$ an extremizing function in Theorem 1 because $f(x)$ satisfies (4.13) with
the equality sign, hence


(8.13)    $\|f'\| = 4$.


Moreover, we wish to call this $f(x)$ an extremizing function in the strong sense
because the supremum of $|f'(x)|$ ($= 4$) is actually assumed for a real $x$, viz. $x = \tfrac{1}{2}$.
We shall see in §9 that there are numerous extremizing functions $f(x)$ in Theorem 1,
hence satisfying (8.13), such that


(8.14)    $|f'(x)| < 4$ for all real $x$.


Such functions may be called extremizing functions in the weak sense.
Let us establish Theorem 2 (§4): Let $f(x)$ satisfy (4.14) $\|f\| \le 1$, $\|f'''\| \le 24$,
and let us show that (4.15) $\|f'\| \le 3$, $\|f''\| \le 12$.
We reproduce here the second differentiation formula


(B)    $f'(\tfrac{1}{2}) = f(1) - f(0) + \int_0^1 K_2(x) f'''(x)\,dx$

of Lemma 8 and apply it to the function

(8.15)    $f_0(x) = -\mathcal{E}_3(x) = -1 + 6x^2 - 4x^3$ in $[0,1]$.
This function has in $[0,1]$ the properties

(8.16)    $f_0(0) = -1$, $f_0(1) = 1$, $f_0'''(x) = -24$.


In view of (B′), of Lemma 8, we know that $K_2(x) < 0$ in $(0,1)$, and from (B) we
derive


(8.17)    $3 = f_0'(\tfrac{1}{2}) = 1 + 1 + 24 \int_0^1 |K_2(x)|\,dx$.


If $f(x)$ is any function satisfying (4.14), let us evaluate $f'(\tfrac{1}{2})$ by (B), assuming that
$f'(\tfrac{1}{2}) \ge 0$ (otherwise we take $-f(x)$). We obtain


(8.18)    $(0 \le)\ f'(\tfrac{1}{2}) = f(1) - f(0) + \int_0^1 K_2(x) f'''(x)\,dx \le 1 + 1 + 24 \int_0^1 |K_2(x)|\,dx = 3$,


by (8.17) and the first inequality (4.15) is thereby established.



At this point we interrupt our proof of Theorem 2 in order to see what we can
say about $f(x)$ if


(8.19)    $f'(\tfrac{1}{2}) = 3$,


i.e., if equality holds in (8.18). From (4.14) we see that we have equality in (8.18)
if and only if $f(x)$ has the properties


(8.20)    $f(0) = -1$, $f(1) = 1$, and $f'''(x) = -24$ in $(0,1)$.


However, as before, we also have
(8.21)    $f'(0) = f'(1) = 0$


and, of course, (8.19). The conditions (8.19), (8.20) and (8.21) are more than sufficient
to imply


(8.22)    $f(x) = -\mathcal{E}_3(x)$ in $[0,1]$.
Let us record here this result as


COROLLARY 1. If $f(x)$ satisfies (4.14) and (8.19), then (8.22) also holds.
We now wish to establish the second inequality (4.15): For this purpose we
need the third formula


(C)    $f''(0) = f(1) - 2f(0) + f(-1) + \int_{-1}^1 K_3(x) f'''(x)\,dx$
of Lemma 8. We recall that by the formula (C′) of that lemma the kernel has the
properties
(8.23)    $K_3(x) > 0$ in $(-1,0)$, $K_3(x) < 0$ in $(0,1)$.
We now apply (C) to the function
(8.24)    $f_0(x) = -\mathcal{E}_3(x) = -1 + 6x^2 + 4x^3$ in $[-1,0]$, $f_0(x) = -1 + 6x^2 - 4x^3$ in $[0,1]$.
This function has the properties
(8.25)    $f_0(-1) = 1$, $f_0(0) = -1$, $f_0(1) = 1$, $f_0'''(x) = 24$ in $(-1,0)$, $f_0'''(x) = -24$ in $(0,1)$,
and (C), (8.23), and (8.25) show that
(8.26)    $12 = f_0''(0) = 1 + 2 + 1 + 24 \int_{-1}^1 |K_3(x)|\,dx$.


If $f(x)$ is any function satisfying (4.14), and assuming that $f''(0) \ge 0$, an application




of (C) shows that


(8.27)    $(0 \le)\ f''(0) = f(1) - 2f(0) + f(-1) + \int_{-1}^1 K_3(x) f'''(x)\,dx \le 1 + 2 + 1 + 24 \int_{-1}^1 |K_3(x)|\,dx = 12$


by (8.26). Applying this result to $f(x + x_0)$ we obtain that $|f''(x_0)| \le 12$, and
Theorem 2 is established. □
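The three bounds obtained so far rest on nothing more than the integrals of $|K_1|$, $|K_2|$ and $|K_3|$. A brief Python sketch with exact fractions makes the bookkeeping explicit; each kernel consists of two mirror-image monomial pieces, so one piece is integrated and doubled. The helper name `monomial_integral` is an assumption of this illustration.

```python
from fractions import Fraction

def monomial_integral(c, p, h):
    """Integral of c * u**p over 0 <= u <= h."""
    return c * h ** (p + 1) / (p + 1)

half = Fraction(1, 2)
one = Fraction(1)

norm_K1 = 2 * monomial_integral(one, 1, half)     # |K_1| = u on [0, 1/2], mirrored on [1/2, 1]
norm_K2 = 2 * monomial_integral(half, 2, half)    # |K_2| = u^2 / 2 on [0, 1/2], mirrored
norm_K3 = 2 * monomial_integral(half, 2, one)     # |K_3| = u^2 / 2 with u = 1 - |x|, 0 <= u <= 1

bound_f1_thm1 = 1 + 1 + 8 * norm_K1               # right side of (8.4)
bound_f1_thm2 = 1 + 1 + 24 * norm_K2              # right side of (8.18)
bound_f2_thm2 = 1 + 2 + 1 + 24 * norm_K3          # right side of (8.27)
```

The three bounds evaluate to 4, 3 and 12, the constants of (4.13) and (4.15).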
Let us assume that $f(x)$, satisfying (4.14), is such that


(8.28)    $f''(0) = 12$,


and let us examine the consequences of this assumption. Clearly (8.28) holds if and only if
we have the equality sign in (8.27) and this in turn holds if and only if


$f(-1) = 1$, $f(0) = -1$, $f(1) = 1$, and $f'''(x) = 24$ in $(-1, 0)$, $f'''(x) = -24$ in $(0, 1)$.


From this we conclude that


(8.29)    $f(x) = -\mathcal{E}_3(x)$ in $-1 \le x \le 1$.

We have therefore established the
COROLLARY 2. If $f(x)$ satisfies (4.14) and (8.28) holds, then also (8.29) holds.
The following generalization follows by a change of origin:


COROLLARY 2′. If $f(x)$ satisfies (4.14) and is such that


(8.30)    $f''(a) = \pm 12$
then
(8.31)    $f(x) = \mp\mathcal{E}_3(x - a)$ if $a - 1 \le x \le a + 1$.


We may now state our


THEOREM 5. If (4.14) $\|f\| \le 1$, $\|f'''\| \le 24$ and if in one of the inequalities
(4.15) $\|f'\| \le 3$, $\|f''\| \le 12$, we have the equality sign, the corresponding supremum
being actually attained, then


(8.32)    $f(x) = \mathcal{E}_3(x - c)$ for all real $x$,


for an appropriate constant $c$.


Proof: 1. Let us assume that $\|f'\| = 3$. This supremum being assumed, we
lose no generality by assuming that




(8.33)    $f'(\tfrac{1}{2}) = 3$.
Now Corollary 1 implies that
(8.34)    $f(x) = -\mathcal{E}_3(x)$ in $[0,1]$.


FIG. 2. Graph of $\mathcal{E}_3(x)$.


This in turn shows that $f''(1) = -12$ (see Figure 2) and now Corollary 2′ shows
$f(x) = -\mathcal{E}_3(x)$ in $[0,2]$. But then surely $f''(2) = 12$ and Corollary 2′ implies that
$f(x) = -\mathcal{E}_3(x)$ in $[1,3]$. We can continue in this way indefinitely and conclude
that $f(x) = -\mathcal{E}_3(x)$ for $x \ge 0$. However, the same reasoning works also to the
left: From (8.34) we conclude that $f''(0) = +12$ and therefore (8.34) holds also in
$[-1,1]$, hence $f''(-1) = -12$ and (8.34) holds in $[-2,0]$, and so forth. Therefore
$f(x) = -\mathcal{E}_3(x) = \mathcal{E}_3(x - 1)$ holds for all real $x$.

2. If we have equality in the second inequality (4.15), we get the same conclusion
by applying only Corollary 2′. □


9. The extremizing functions in the weak sense. In the present section we 
discuss only the cubic case of n = 3. Our last Theorem 5 has answered the question 
as to when we have the equality sign in one of the inequalities (4.15) for the case 
when the respective supremum is actually attained. 


DEFINITION 4. We say that $f(x)$ is an extremizing function in the weak sense
for $n = 3$, provided that $f(x)$ satisfies the inequalities


(9.1)    $\|f\| \le 1$, $\|f'''\| \le 24$,
and therefore also
(9.2)    $\|f'\| \le 3$, $\|f''\| \le 12$,


with the equality sign in one of the inequalities (9.2), the corresponding supremum
not being attained.

This definition raises the following questions: 

QUESTION 1. Do extremizing functions in the weak sense exist? 

QUESTION 2. Let us suppose that they do and let f(x) be one such. Does then 
the equality sign hold in all four inequalities (9.1), (9.2)? 




We shall see that the answers to both questions are affirmative. 
The affirmative answer to Question 1 is contained in 


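THEOREM 6. There exist functions $f(x)$ such that
(9.3)    $\|f\| = 1$, $\|f'\| = 3$, $\|f''\| = 12$, $\|f'''\| = 24$,
while
(9.4)    $|f(x)| < 1$, $|f'(x)| < 3$, $|f''(x)| < 12$, $|f'''(x)| < 24$ for all real $x$.


Proof: We know that $f(x) = \mathcal{E}_3(x)$ satisfies (9.3), but not (9.4). To enforce
both (9.3) and (9.4) we let the function “sag between $-\infty$ and $+\infty$” by passing
to the new function


(9.5)    $f(x) = \mathcal{E}_3(x)\phi(x)$


with an appropriate positive function $\phi(x)$ to be constructed.
Using the known values (4.10) of $\gamma_{3,\nu} = \|\mathcal{E}_3^{(\nu)}\|$ we derive from (9.5) the inequalities


$|f(x)| = |\mathcal{E}_3\phi| \le \phi(x)$,
$|f'(x)| = |\mathcal{E}_3'\phi + \mathcal{E}_3\phi'| \le 3\phi(x) + |\phi'(x)|$,
$|f''(x)| = |\mathcal{E}_3''\phi + 2\mathcal{E}_3'\phi' + \mathcal{E}_3\phi''| \le 12\phi(x) + 6|\phi'(x)| + |\phi''(x)|$,
$|f'''(x)| = |\mathcal{E}_3'''\phi + 3\mathcal{E}_3''\phi' + 3\mathcal{E}_3'\phi'' + \mathcal{E}_3\phi'''| \le 24\phi(x) + 36|\phi'(x)| + 9|\phi''(x)| + |\phi'''(x)|$.

We shall therefore satisfy (9.4) if $\phi(x)$ is positive and such that

$\phi < 1$,

$3\phi + |\phi'| < 3$,

$12\phi + 6|\phi'| + |\phi''| < 12$,

$24\phi + 36|\phi'| + 9|\phi''| + |\phi'''| < 24$, for all real $x$.


These amount to


(9.6)
$1 - \phi > 0$,

$1 - \phi > \tfrac{1}{3}|\phi'|$,

$1 - \phi > \tfrac{1}{2}|\phi'| + \tfrac{1}{12}|\phi''|$,

$1 - \phi > \tfrac{3}{2}|\phi'| + \tfrac{3}{8}|\phi''| + \tfrac{1}{24}|\phi'''|$.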




Observe that the last inequality implies the previous ones. It suffices therefore to
require that $\phi(x)$ be positive and satisfy the differential inequality


(9.7)    $1 - \phi(x) > \tfrac{3}{2}|\phi'(x)| + \tfrac{3}{8}|\phi''(x)| + \tfrac{1}{24}|\phi'''(x)|$ for all real $x$.
In order to insure also the equations (9.3), it is clear that $\phi(x)$ should also satisfy
the boundary conditions
(9.8)    $\phi(x) \to 1$, $\phi'(x) \to 0$, $\phi''(x) \to 0$, $\phi'''(x) \to 0$ as $x \to \pm\infty$.


Indeed, Leibniz’s formulae (see (9.6)!) and the periodicity of $\mathcal{E}_3(x)$ will then show
that


$\|f^{(\nu)}\| = \limsup_{x \to \pm\infty} |f^{(\nu)}(x)| = \|\mathcal{E}_3^{(\nu)}\|$, $(\nu = 0, 1, 2, 3)$.


Let
(9.9)    $\psi(x) = 1 - e^{-\gamma x}$, ($\gamma$ positive constant).
A simple calculation shows that $\psi(x)$ will surely satisfy (9.7), provided that


(9.10)    $1 > \tfrac{3}{2}\gamma + \tfrac{3}{8}\gamma^2 + \tfrac{1}{24}\gamma^3$,


for which $0 < \gamma < 1/2$ will certainly do. We now define
(9.11)    $\phi(x) = 1 - e^{-\gamma x}$ if $x \ge 1$, $\phi(x) = \phi(-x)$ if $x \le -1$.


Assuming (9.10), this function satisfies (9.7) outside the interval $(-1,1)$. Moreover,
$\phi(x)$ is positive and also satisfies the boundary conditions (9.8).

There remains to bridge the gap between $-1$ and $1$ and this we do by interpolation
as follows. Let


(9.12)    $P(x) = A + Bx^2 + Cx^4$
and
(9.13)    $\phi(x) = P(x)$ in $-1 \le x \le 1$.


We also require $P(x)$ to satisfy the interpolatory conditions
(9.14)    $P(1) = \psi(1)$, $P'(1) = \psi'(1)$, $P''(1) = \psi''(1)$.


The functions $P(x)$ and $\phi(x)$ being both even, it is clear that the requirements (9.14)
will insure that $\phi(x) \in C^2(R)$.

We are yet to insure that $P(x)$ is positive and satisfies (9.7) in $[0,1]$, and therefore
also in $[-1,1]$. From (9.14) we easily get for the coefficients of $P(x)$ the values




5 1 _ 1 _ 1 _ 
(9.15) A=1-( teytere ’,B = 776 + Ye 1,C= — 70 + ye’. 


1. The positivity of P(x) in [0,1]: Dropping the positive term Bx” 
P(x) = A+ Bx? +Cx*>A+C 


5 1 _ 1 - 
= 1-(l+ oy + 9y°)e "— 371 + ye ™>0 


because the last inequality is equivalent to e’ > 1 + £y + 2y?, which evidently holds. 
2. $P(x)$ satisfies (9.7) in $[0,1]$: We are to find $\gamma$ such that

(9.16)    $1 - A - Bx^2 - Cx^4 > \tfrac{3}{2}|2Bx + 4Cx^3| + \tfrac{3}{8}|2B + 12Cx^2| + \tfrac{1}{24}|24Cx|$

holds in $0 \le x \le 1$. Dropping on the left the positive term $-Cx^4$, cancelling the
common factor $e^{-\gamma}$, and taking on the left side all terms with their negative values
for $x = 1$ and on the right with their positive values for $x = 1$, we easily find, after
rearrangements, that the inequality (9.16) is surely satisfied if the inequality
$1 > \tfrac{1}{8}(35\gamma + 25\gamma^2)$ holds. This is the case if

(9.17)    $0 < \gamma < \tfrac{1}{5}$.
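The interpolation step can be checked numerically. The following is only a sketch, assuming the coefficients of (9.15) read as $A = 1 - (1 + \tfrac58\gamma + \tfrac18\gamma^2)e^{-\gamma}$, $B = \tfrac14(3\gamma + \gamma^2)e^{-\gamma}$, $C = -\tfrac18(\gamma + \gamma^2)e^{-\gamma}$; the function names and the sample value $\gamma = 0.19$ are illustrative choices, not part of the paper.

```python
import math

def coefficients(gamma):
    """Coefficients A, B, C of P(x) = A + B x^2 + C x^4, as assumed from (9.15)."""
    e = math.exp(-gamma)
    A = 1 - (1 + 5 * gamma / 8 + gamma ** 2 / 8) * e
    B = (3 * gamma + gamma ** 2) / 4 * e
    C = -(gamma + gamma ** 2) / 8 * e
    return A, B, C

def interpolation_residuals(gamma):
    """Residuals of the three conditions (9.14) at x = 1; all should vanish."""
    A, B, C = coefficients(gamma)
    e = math.exp(-gamma)
    return (
        A + B + C - (1 - e),               # P(1)   - psi(1)
        2 * B + 4 * C - gamma * e,         # P'(1)  - psi'(1)
        2 * B + 12 * C + gamma ** 2 * e,   # P''(1) - psi''(1)
    )

def slack(gamma, x):
    """Left side minus right side of (9.16) at the point x."""
    A, B, C = coefficients(gamma)
    lhs = 1 - A - B * x ** 2 - C * x ** 4
    rhs = (1.5 * abs(2 * B * x + 4 * C * x ** 3)
           + 0.375 * abs(2 * B + 12 * C * x ** 2)
           + abs(24 * C * x) / 24)
    return lhs - rhs

def min_slack(gamma, n=1000):
    """Smallest slack of (9.16) on a uniform grid of [0, 1]."""
    return min(slack(gamma, k / n) for k in range(n + 1))

gamma = 0.19                       # illustrative value inside (9.17)
residuals = interpolation_residuals(gamma)
worst = min_slack(gamma)
print(residuals, worst)
```

A positive value of `worst` on the grid is consistent with (9.16) for this particular $\gamma$; it is a spot check on a grid, not a proof.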


To summarize: Let $\gamma$ satisfy (9.17) and $P(x)$ be defined by (9.12) and (9.15).
Finally let

$\varphi(x) = 1 - e^{-\gamma|x|}$ if $|x| \ge 1$,  $\varphi(x) = P(x)$ if $-1 < x < 1$.

Then $f(x)$, defined by (9.5), satisfies the conditions (9.3) and (9.4) of Theorem 6. □

The second question is answered affirmatively by


THEOREM 7. Let $f(x)$ be such that

(9.18)    $\|f\| \le 1$, $\|f'''\| \le 24$,

and therefore

(9.19)    $\|f'\| \le 3$, $\|f''\| \le 12$.

If the equality sign holds in one of the inequalities (9.19), then the equality sign
holds in all four inequalities.


Proof: 1. Let us assume that

(9.20)    $\|f'\| = 3$.

If the supremum $\|f'\|$ is assumed, then we know by Theorem 5 that (8.32) holds
and we are through. We may therefore assume that

(9.21)    $|f'(x)| < 3$ for all real $x$.

Let $(x_\nu)$, $(\nu = 1,2,\dots)$, be a sequence of points such that

(9.22)    $\lim_{\nu\to\infty} f'(x_\nu) = 3$,

the reasoning to be applied being similar if this limit should be $-3$. It should be
clear that the sequence $(x_\nu)$ cannot have a finite limit point $\xi$, for we could then
conclude from the continuity of $f'(x)$ that $f'(\xi) = 3$, in contradiction to (9.21).
We may therefore assume that $x_\nu \to +\infty$, or perhaps $-\infty$. Let us assume that

(9.23)    $\lim_{\nu\to\infty} x_\nu = +\infty$.

By the formula (B) of Lemma 8 we may write

(9.24)    $f'(x_\nu) = f(x_\nu + \tfrac12) - f(x_\nu - \tfrac12) + \int_0^1 K_2(x) f'''(x + x_\nu - \tfrac12)\,dx$,

while (8.17) shows that

(9.25)    $3 = 1 + 1 + \int_0^1 |K_2(x)| \cdot 24\,dx$.

From (9.22) we conclude that

(9.26)    $f(x_\nu + \tfrac12) - f(x_\nu - \tfrac12) + \int_0^1 |K_2(x)| \{-f'''(x + x_\nu - \tfrac12)\}\,dx \to 1 + 1 + \int_0^1 |K_2(x)| \cdot 24\,dx$ as $\nu \to \infty$.

From this relation we shall derive all that we need.
We observe first that

$f(x_\nu + \tfrac12) - f(x_\nu - \tfrac12) + \int_0^1 |K_2| \{-f'''\}\,dx > 1 + 1 + \int_0^1 |K_2| \cdot 24\,dx - \varepsilon$

if $\nu > N(\varepsilon)$, while

$\int_0^1 |K_2| \cdot 24\,dx \ge \int_0^1 |K_2| \{-f'''\}\,dx$

and $1 \ge -f(x_\nu - \tfrac12)$ hold anyway. Adding these three inequalities we find that
$f(x_\nu + \tfrac12) > 1 - \varepsilon$ if $\nu > N(\varepsilon)$ and therefore

(9.27)    $\lim_{\nu\to\infty} f(x_\nu + \tfrac12) = 1$.

Similarly we find that

(9.28)    $\lim_{\nu\to\infty} f(x_\nu - \tfrac12) = -1$,

and finally, from (9.26), that

$\lim_{\nu\to\infty} \int_0^1 |K_2(x)| \{-f'''(x + x_\nu - \tfrac12)\}\,dx = \int_0^1 |K_2(x)| \cdot 24\,dx$.


If we write

(9.29)    $\varphi_\nu(x) = 24 + f'''(x + x_\nu - \tfrac12)$, $(0 \le x \le 1)$,

we know that this sequence of piecewise continuous functions has the properties

(9.30)    $0 \le \varphi_\nu(x) \le 48$ in $[0,1]$,

and

(9.31)    $\lim_{\nu\to\infty} \int_0^1 |K_2(x)|\,\varphi_\nu(x)\,dx = 0$.

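From (B') of §7 we know that $|K_2(x)|$ vanishes at $0$ and $1$, that it increases in
$[0,\tfrac12]$ and decreases in $[\tfrac12,1]$. Also that $K_2(x) = K_2(1 - x)$. We choose $\alpha$ such that
$0 < \alpha < \tfrac12$ and may write

(9.32)    $\int_0^1 |K_2(x)|\,\varphi_\nu(x)\,dx \ge |K_2(\alpha)| \int_\alpha^{1-\alpha} \varphi_\nu(x)\,dx \ge |K_2(\alpha)|\,(1 - 2\alpha) \inf_{[\alpha,1-\alpha]} \varphi_\nu(x)$.

Now (9.31) implies that

(9.33)    $\inf_{[\alpha,1-\alpha]} \varphi_\nu(x) \to 0$ as $\nu \to \infty$.

Selecting $\xi_\nu$ in $[\alpha,1-\alpha]$ such that $\varphi_\nu(\xi_\nu) < \inf \varphi_\nu(x) + 2^{-\nu}$, we conclude from
(9.33) that $\varphi_\nu(\xi_\nu) \to 0$. Finally, returning to $f'''$ by (9.29) we have shown that

(9.34)    $\lim_{\nu\to\infty} f'''(\xi_\nu + x_\nu - \tfrac12) = -24$.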


Evidently (9.27), or (9.28), and (9.34) show that

(9.35)    $\|f\| = 1$, $\|f'''\| = 24$.

There remains to show that

(9.36)    $\|f''\| = 12$.

From (9.31) and (9.32) we conclude that

$\lim_{\nu\to\infty} \int_\alpha^{1-\alpha} \varphi_\nu(x)\,dx = 0$.

However, this integral can be evaluated by (9.29) and we obtain

$24(1 - 2\alpha) + f''(\eta_\nu) - f''(\xi_\nu) \to 0$,

where $\eta_\nu = 1 - \alpha + x_\nu - \tfrac12$, $\xi_\nu = \alpha + x_\nu - \tfrac12$. Therefore $f''(\xi_\nu) - f''(\eta_\nu) \to 24(1 - 2\alpha)$
as $\nu \to \infty$, hence $f''(\xi_\nu) - f''(\eta_\nu) > 24 - 48\alpha - \varepsilon$ if $\nu > N(\varepsilon)$. Adding to this the
inequality $f''(\eta_\nu) \ge -12$ we obtain that $f''(\xi_\nu) > 12 - 48\alpha - \varepsilon$ if $\nu > N(\varepsilon)$. Since
$\xi_\nu \to +\infty$ and $\alpha$ is arbitrarily small, we conclude that

$\limsup_{x\to+\infty} f''(x) = 12$.

This, together with $\|f''\| \le 12$, shows that (9.36) holds.

2. A similar method, this time using the approximate differentiation formula (C)
of Lemma 8, allows us to show that (9.36) implies the equality sign in all other
inequalities (9.18) and (9.19). However, we omit further details. □

3. There are theorems analogous to Theorems 6 and 7 for the case when $n = 2$
and they are easier to derive. Also for $n = 2$ there are extremizing functions in the
weak sense, i.e., satisfying (8.8), (8.13), and (8.14). The details may be left to the
reader.


10. How is Theorem 3 established? As its title indicates, this paper is devoted to
the elementary cases of Landau's problem. However, Theorem 3 is no longer an
elementary case. The ideas underlying its proof are just as simple as before, but the
necessary tools, i.e., the required approximate differentiation formulae, are more
complicated.

Let us sketch, with a minimum of detail, a proof of the first inequality

(10.1)    $\|f'\| \le 16/5$

of Theorem 3, assuming that

(10.2)    $\|f\| \le 1$, $\|f^{(4)}\| \le 384/5$.
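The approximate differentiation formula that we need is

(10.3)    $f'(\tfrac12) = \mu f(1) + \mu\lambda f(2) + \mu\lambda^2 f(3) + \cdots - \mu f(0) - \mu\lambda f(-1) - \mu\lambda^2 f(-2) - \cdots + \int_{-\infty}^{\infty} K(x) f^{(4)}(x)\,dx$,

where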
where 
(10.4)    $\mu = 1.14534$, $\lambda = -11 + 2\sqrt{30} = -.045548$.


The kernel $K(x)$ is a cardinal cubic spline, i.e., having its knots at the integers,
except that at x =1 it has a discontinuity in its second derivative. It satisfies
$K(x) = -K(1 - x)$ and is therefore odd about the point $x = 1/2$. $K(x)$ decays
exponentially as $x \to \pm\infty$ so that $K(x)$ is absolutely integrable on the real axis.
Moreover

(10.5)    $K(\nu + \tfrac12) = 0$ for all integer $\nu$,

and $K(x)$ vanishes nowhere else. Finally

(10.6)    $K(x) < 0$ if $-\tfrac12 < x < \tfrac12$

and it changes sign at each $\nu + \tfrac12$ (see Figure 3).


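If we substitute into (10.3) the function

$f_0(x) = -\mathcal{E}_4(x)$,

we find that $K(x) f_0^{(4)}(x)$ is positive for all $x$, except that it vanishes if $x = \nu + \tfrac12$
by (10.5). Since $f_0^{(4)}(x) = \pm 384/5$ we obtain the result

(10.7)    $\dfrac{16}{5} = f_0'(\tfrac12) = 2\mu \sum_{\nu=0}^{\infty} |\lambda|^{\nu} + \dfrac{384}{5} \int_{-\infty}^{\infty} |K(x)|\,dx$.

If $f(x)$ is any function satisfying (10.2), and assuming that $f'(\tfrac12) \ge 0$, we obtain
from (10.3) and (10.2) the estimate

$0 \le f'(\tfrac12) \le 2\mu \sum_{\nu=0}^{\infty} |\lambda|^{\nu} + \dfrac{384}{5} \int_{-\infty}^{\infty} |K(x)|\,dx = \dfrac{16}{5}$,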
by (10.7). By reasonings used before, the equality sign is seen to hold only if
$f(x) = -\mathcal{E}_4(x)$. This establishes (10.1), except that we have not proved the
identity (10.3), nor do we propose to do so. However, let me say the following:
The formula (10.3) is exact, i.e., its remainder vanishes, whenever $f(x)$ is a cubic
polynomial. This clearly does not characterize the formula. However, (10.3) can be
shown to be exact if $f(x)$ is a cardinal cubic spline with knots at $\nu + \tfrac12$ that grows
at most like a power of $|x|$ as $x \to \pm\infty$, and this condition characterizes the formula
(10.3) and allows us to derive it.

A last remark: The question arises whether (10.3) could be replaced in the above
application by some appropriate finite formula that involves only finitely many
of the ordinates $f(\nu)$. The answer is no: It can be shown that no finite differentiation
formula exists that will serve the same purpose. For further details we refer to [13].
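The stated exactness of (10.3) for cubic polynomials can be tried out numerically, since for a cubic the integral term drops out and only the nodal sum remains. The sketch below rests on two assumptions that are not taken from the text: the nodal terms are read as $\mu\lambda^k f(k+1) - \mu\lambda^k f(-k)$ for $k = 0, 1, 2, \dots$, and $\mu$ is computed as $(1-\lambda)^2/(1+\lambda)$, which is the value that exactness for $f(x) = x$ would force. The helper name `nodal_rule` is hypothetical.

```python
import math

lam = -11 + 2 * math.sqrt(30)        # lambda of (10.4)
mu = (1 - lam) ** 2 / (1 + lam)      # assumption: forced by exactness on f(x) = x

def nodal_rule(f, terms=60):
    """Truncated nodal part of (10.3): sum of mu*lam^k*(f(k+1) - f(-k))."""
    return sum(mu * lam ** k * (f(k + 1) - f(-k)) for k in range(terms))

# For cubic polynomials f'''' = 0, so the nodal sum alone should give f'(1/2).
d_linear = nodal_rule(lambda x: x)                 # exact derivative: 1
d_quadratic = nodal_rule(lambda x: x * x)          # exact derivative: 1
d_cubic = nodal_rule(lambda x: (x - 0.5) ** 3)     # exact derivative: 0

# Nodal contribution 2*mu*sum(|lam|^nu) appearing in (10.7), as a geometric series.
node_sum = 2 * mu / (1 - abs(lam))
# Value of the integral of |K| that (10.7) would then imply.
implied_kernel_integral = (16 / 5 - node_sum) * 5 / 384

print(mu, lam, d_linear, d_quadratic, d_cubic, node_sum, implied_kernel_integral)
```

Under these assumptions the nodal sum in (10.7) evaluates to $12/5$, which would leave $4/5$ for the integral term; this is offered as a consistency check on the constants in (10.4), not as a derivation of the formula.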




III. LANDAU'S PROBLEM FOR $\mathbf{R}_+ = [0, \infty)$.


11. The case $n = 2$. Landau's problem for the halfline $\mathbf{R}_+$ is similar to the
problem solved by Kolmogorov's theorem, the difference being that now the
competition is open only for functions from $\mathbf{R}_+$ to $\mathbf{R}$. Accordingly, the role of the
previous norm $\|f\|$ is now taken over by the halfline norm

(11.1)    $\|f\|_+ = \sup |f(x)|$ for $x \ge 0$.

To facilitate the comparison with the results of Part II, we choose the same
normalization as in Kolmogorov's theorem, namely

(11.2)    $\|f\|_+ \le 1$, $\|f^{(n)}\|_+ \le \gamma_n$,

the objective being to find within this class those functions that maximize the norms

(11.3)    $\|f^{(\nu)}\|_+$, for $\nu = 1,2,\dots,n-1$.

The transition to other normalizations, such as the one used in [11], can be achieved
by means of the trivial transformation (6.3) used in §6. In §13 the $\mathbf{R}_+$-analogue of
Kolmogorov's theorem will be mentioned. In the meantime we turn to the first of
the two elementary cases of the problem.

We assume that

(11.4)    $\|f\|_+ \le 1$, $\|f''\|_+ \le 8$

and seek a function $f_0(x)$ such that $\|f_0'\|_+ \ge \|f'\|_+$ for all functions $f(x)$ satisfying
(11.4).

The function $\mathcal{E}_2(x)$ satisfies (11.4) and we also know that $\|\mathcal{E}_2'\|_+ = 4$ (this is
where the constant 4 of Theorem 1 came from). Now we can do better! Indeed,
let us consider $\mathcal{E}_2(x)$ for $x \ge -\tfrac12$, and let us remove the knot at $x = -\tfrac12$ and
continue the quadratic $y = 1 - 4x^2$ (see (4.6)) also for values of $x < -\tfrac12$, until
we reach the point where the parabolic graph of $y = 1 - 4x^2$ intersects the horizontal
line $y = -1$. We find that this happens for $x = -1/\sqrt{2}$. We consider the function

$g(x) = 1 - 4x^2$ if $-1/\sqrt{2} \le x \le 0$,  $g(x) = \mathcal{E}_2(x)$ if $0 \le x < \infty$,

and shift the origin to $-1/\sqrt{2}$ to define

$f_0(x) = g(x - 1/\sqrt{2})$ for $x \ge 0$ (see Figure 4).

Clearly

(11.5)    $\|f_0\|_+ = 1$, $\|f_0''\|_+ = 8$.

However, it should also be clear from Figure 4 that $\|f_0'\|_+$ is reached by $|f_0'(x)|$
for $x = 0$ so that

$\|f_0'\|_+ = f_0'(0) = g'(-1/\sqrt{2}) = -8x\,|_{x = -1/\sqrt{2}} = 4\sqrt{2} = 5.65685$.


FIG. 4. The function $f_0(x)$.


Therefore

(11.6)    $\|f_0'\|_+ = f_0'(0) = 4\sqrt{2}$.

We see that the conditions (11.4) no longer imply that $\|f'\|_+ \le 4$, as in Theorem 1,
but allow considerably larger values such as $\|f'\|_+ = 4\sqrt{2}$. This, however, is the
largest value, a fact which we state as


THEOREM 8. (Landau's theorem). If

(11.7)    $\|f\|_+ \le 1$, $\|f''\|_+ \le 8$,

then

(11.8)    $\|f'\|_+ \le 4\sqrt{2}$.

Here $4\sqrt{2}$ is the best constant because it is reached for the above function $f_0(x)$.


Proof: As in the case of Theorem 1, we need a differentiation formula. By
Taylor's formula (7.1), for $n = 2$, $a = 0$ and $t = 1/\sqrt{2}$, we have

$f(1/\sqrt{2}) = f(0) + (1/\sqrt{2}) f'(0) + \int_0^{1/\sqrt{2}} (1/\sqrt{2} - x) f''(x)\,dx$.

Solving for $f'(0)$ we obtain

(11.9)    $f'(0) = \sqrt{2}\,f(1/\sqrt{2}) - \sqrt{2}\,f(0) - \int_0^{1/\sqrt{2}} (1 - x\sqrt{2}) f''(x)\,dx$.

Applying this to $f_0(x)$ we find that

(11.10)    $4\sqrt{2} = f_0'(0) = \sqrt{2} + \sqrt{2} + 8 \int_0^{1/\sqrt{2}} (1 - x\sqrt{2})\,dx$.

If $f(x)$ is any function satisfying (11.7), and assuming $f'(0) \ge 0$ (else we work with
$-f$), and estimating $f'(0)$ from (11.9) and (11.7) we obtain

$0 \le f'(0) \le \sqrt{2} + \sqrt{2} + 8 \int_0^{1/\sqrt{2}} (1 - x\sqrt{2})\,dx = 4\sqrt{2}$

by (11.10). Therefore

(11.11)    $|f'(0)| \le 4\sqrt{2}$.

However, if $f(x)$ satisfies (11.7), also $f(x + x_0)$, with $x_0 \ge 0$, will satisfy (11.7).
We may therefore apply to $f(x + x_0)$ our previous conclusion (11.11) to infer that
$|f'(x_0)| \le 4\sqrt{2}$. Therefore $\|f'\|_+ \le 4\sqrt{2}$. □
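The differentiation formula (11.9) and the evaluation (11.10) lend themselves to a quick numerical check. The sketch below assumes the readings $f'(0) = \sqrt2 f(1/\sqrt2) - \sqrt2 f(0) - \int_0^{1/\sqrt2}(1 - x\sqrt2) f''(x)\,dx$ and $f_0(x) = -1 + 4\sqrt2\,x - 4x^2$ on $[0, 1/\sqrt2]$; the midpoint quadrature and the helper names are illustrative choices only.

```python
import math

SQRT2 = math.sqrt(2.0)
T = 1.0 / SQRT2                      # the point t = 1/sqrt(2) used in (11.9)

def f0(x):
    """The extremal quadratic branch on [0, 1/sqrt(2)]."""
    return -1.0 + 4.0 * SQRT2 * x - 4.0 * x * x

def midpoint(fn, a, b, n=20000):
    """Composite midpoint rule for the integral of fn over [a, b]."""
    h = (b - a) / n
    return h * sum(fn(a + (k + 0.5) * h) for k in range(n))

def rhs_11_9(f, f2):
    """Right side of (11.9) for a function f with second derivative f2."""
    integral = midpoint(lambda x: (1.0 - x * SQRT2) * f2(x), 0.0, T)
    return SQRT2 * f(T) - SQRT2 * f(0.0) - integral

# (11.10): applied to f0, whose second derivative is the constant -8.
derivative_f0 = rhs_11_9(f0, lambda x: -8.0)
# The identity holds for any smooth f; sin has f'(0) = 1 and f'' = -sin.
derivative_sin = rhs_11_9(math.sin, lambda x: -math.sin(x))

print(derivative_f0, 4 * SQRT2, derivative_sin)
```

The first value should agree with $4\sqrt2$ and the second with $1$, up to quadrature error.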

We turn now to the extremizing functions. Unlike the situation discussed in §9
(for $n = 3$) there are no extremizing functions in the weak sense in the present case
of $\mathbf{R}_+$. In fact we have the following very precise theorem.


THEOREM 9. If $f(x)$ satisfies (11.7) and if

(11.12)    $\|f'\|_+ = 4\sqrt{2}$,

then

(11.13)    $f(x) = \pm f_0(x)$ in the interval $0 \le x \le 1/\sqrt{2}$.


REMARK. Beyond (11.13) there is little that can be said about the extremizing
function $f(x)$. Indeed, the function (11.13) can be continued from $1/\sqrt{2}$ to $+\infty$ in
various ways, such as $f(x) = \pm 1$ if $x > 1/\sqrt{2}$, or else by $f(x) = \pm\mathcal{E}_2(x - 1/\sqrt{2})$,
without violating the basic condition (11.7).


Proof: We distinguish two cases depending on whether the supremum $\|f'\|_+$
is attained or not.

1. Let us assume that it is attained and that

(11.14)    $f'(\xi) = 4\sqrt{2}$,

for if this value were $-4\sqrt{2}$ we could work with $-f(x)$. Let us write

(11.15)    $K(x) = 1 - x\sqrt{2}$, $(0 \le x \le 1/\sqrt{2})$

for the kernel in the formula (11.9). By (11.9) and (11.10) we conclude that the
equation (11.14) is equivalent to

(11.16)    $\sqrt{2}\,f(\xi + \tfrac{1}{\sqrt{2}}) - \sqrt{2}\,f(\xi) - \int_0^{1/\sqrt{2}} K(x) f''(x + \xi)\,dx = \sqrt{2} + \sqrt{2} + 8 \int_0^{1/\sqrt{2}} K(x)\,dx$.

Because $K(x)$ is positive in $[0, 1/\sqrt{2})$, (11.16) and (11.7) imply that

(11.17)    $f(\xi + \tfrac{1}{\sqrt{2}}) = -f(\xi) = 1$, and $f''(x + \xi) = -8$ in $(0, \tfrac{1}{\sqrt{2}})$.

Clearly $\xi = 0$, for if $\xi$ were positive, then $f(\xi) = -1$ and $\|f\|_+ \le 1$ would imply
that $f'(\xi) = 0$, in contradiction to the assumption (11.14). Now (11.17) reduce to

$f(0) = -1$, $f(1/\sqrt{2}) = 1$, $f''(x) = -8$ in $(0, 1/\sqrt{2})$,

and this already implies that $f(x) = -1 + 4\sqrt{2}\,x - 4x^2 = f_0(x)$ in $[0, 1/\sqrt{2}]$.
Therefore (11.13) is established for this case.

2. Let us assume that

(11.18)    $|f'(x)| < 4\sqrt{2}$ for $x \ge 0$,

and let us show that this cannot possibly happen.
Indeed, the assumption (11.12) implies the existence of an infinite sequence
$(x_\nu)$ of points of $\mathbf{R}_+$, such that

(11.19)    $\lim_{\nu\to\infty} f'(x_\nu) = 4\sqrt{2}$,

where on the right we have chosen the positive sign without loss of generality.
On the other hand we have the following: If

(11.20)    $x \ge \tfrac12$,

then the formula (A) of Lemma 8, the relation (8.3), and the assumptions (11.7)
show that

$|f'(x)| = \left| f(x + \tfrac12) - f(x - \tfrac12) + \int_0^1 K_1(t) f''(t + x - \tfrac12)\,dt \right| \le 1 + 1 + 8 \int_0^1 |K_1(t)|\,dt = 4$.

Thus (11.20) implies that

(11.21)    $|f'(x)| \le 4$.

From (11.19) we now conclude (observe that $4 < 4\sqrt{2}$!) that

(11.22)    $0 \le x_\nu < \tfrac12$ for $\nu$ sufficiently large, $\nu > N$ say.

From the Bolzano-Weierstrass theorem we infer that the sequence $(x_\nu)$ has a limit
point $\xi$ in $[0,\tfrac12]$ and therefore

(11.23)    $\lim_{\nu'\to\infty} x_{\nu'} = \xi$,

where $\nu'$ is an appropriate increasing sequence of integers. The continuity of $f'(x)$
now implies that

$4\sqrt{2} = \lim f'(x_{\nu'}) = f'(\lim x_{\nu'}) = f'(\xi)$.

Therefore $f'(\xi) = 4\sqrt{2}$, in contradiction to (11.18). The second possibility therefore
never arises and Theorem 9 is established. □


12. The case $n = 3$. As in the previous case we retain the conditions (4.14) of
Theorem 2 but this time for $\mathbf{R}_+$, hence

(12.1)    $\|f\|_+ \le 1$, $\|f'''\|_+ \le 24$,

and wish to find $f_0(x)$ satisfying (12.1) and having the largest possible value for the
norm $\|f_0'\|_+$ of its first derivative. We also seek (perhaps another) $f_0(x)$ satisfying
(12.1) and maximizing $\|f''\|_+$. We shall see that one and the same function $f_0(x)$
will do both. From (4.3) and (4.10) we know that

(12.2)    $\|\mathcal{E}_3\|_+ = 1$, $\|\mathcal{E}_3'\|_+ = 3$, $\|\mathcal{E}_3''\|_+ = 12$, $\|\mathcal{E}_3'''\|_+ = 24$,

so that $\mathcal{E}_3(x)$ satisfies (12.1). However, by an appropriate modification of $\mathcal{E}_3(x)$
we can increase considerably the norms of $f'$ and $f''$.

To obtain the modified function $f_0(x)$ we remove the knot $x = 0$ of $\mathcal{E}_3(x)$, and
continue its cubic polynomial branch (4.7), hence $1 - 6x^2 + 4x^3$, for negative
values of $x$ until it intersects the line $y = -1$. This happens for $x = -\tfrac12$ and we
define the function

(12.3)    $g(x) = 1 - 6x^2 + 4x^3$ if $-\tfrac12 \le x \le 0$,  $g(x) = \mathcal{E}_3(x)$ if $x \ge 0$.

For technical reasons we shift the origin to the point $x = -\tfrac12$ and define

(12.4)    $f_0(x) = g(x - \tfrac12)$ (see Figure 5).

Notice that $f_0(x)$ is a cubic spline in $\mathbf{R}_+$ having no longer a knot at $x = \tfrac12$, in fact

(12.5)    $f_0(x) = -1 + 9x - 12x^2 + 4x^3$ if $0 \le x \le 3/2$.

Clearly

(12.6)    $\|f_0\|_+ = 1$, $\|f_0'''\|_+ = 24$.

FIG. 5. The function $f_0(x)$.




Moreover, we verify easily from (12.5) that $|f_0'(x)|$ reaches its largest value for
$x = 0$, hence

(12.7)    $\|f_0'\|_+ = f_0'(0) = 9$.

Similarly we find that also $|f_0''(x)|$ reaches its largest value for $x = 0$. From (12.5)
we read off this value to be

(12.8)    $\|f_0''\|_+ = -f_0''(0) = 24$.
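The values read off from (12.5) can be confirmed by sampling the cubic branch. This is a sketch only: it covers the interval $[0, 3/2]$ on which (12.5) gives $f_0$ explicitly (beyond that $f_0$ is a shifted Euler spline, which is not modelled here), and the grid size is an arbitrary choice.

```python
def f0(x):
    """Cubic branch (12.5) of the modified Euler spline on [0, 3/2]."""
    return -1 + 9 * x - 12 * x ** 2 + 4 * x ** 3

def f0_d1(x):
    """First derivative of (12.5)."""
    return 9 - 24 * x + 12 * x ** 2

def f0_d2(x):
    """Second derivative of (12.5)."""
    return -24 + 24 * x

# Uniform grid on [0, 3/2]; it contains the points 0, 1/2 and 3/2.
grid = [1.5 * k / 3000 for k in range(3001)]

sup_f0 = max(abs(f0(x)) for x in grid)
sup_d1 = max(abs(f0_d1(x)) for x in grid)
sup_d2 = max(abs(f0_d2(x)) for x in grid)

print(f0(0.0), f0(0.5), f0(1.5))   # alternating values at the three nodes
print(sup_f0, sup_d1, sup_d2)      # suprema on the sampled interval
```

On this interval the sampled suprema of $|f_0|$, $|f_0'|$ and $|f_0''|$ come out as 1, 9 and 24, the latter two attained at $x = 0$, in agreement with (12.6) to (12.8).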


Comparing (12.7) and (12.8) with (4.15), we see that $f_0(x)$ surpasses by far the
corresponding bounds of Theorem 2. These, however, are the largest possible values,
as stated by


THEOREM 10. (A. P. Matorin). If

(12.9)    $\|f\|_+ \le 1$, $\|f'''\|_+ \le 24$,

then

(12.10)    $\|f'\|_+ \le 9$, $\|f''\|_+ \le 24$.

In (12.10) the constants 9 and 24 are the best constants because they are reached
by the above function $f_0(x)$ (see [8]).


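Proof: We need two differentiation formulae that we get from (7.1). Applying
(7.1) for $n = 3$, $a = 0$, and the two values $t = \tfrac12$ and $t = 3/2$, we obtain

$f(\tfrac12) = f(0) + \tfrac12 f'(0) + \tfrac18 f''(0) + \tfrac12 \int_0^{1/2} (\tfrac12 - x)^2 f'''(x)\,dx$,

$f(\tfrac32) = f(0) + \tfrac32 f'(0) + \tfrac98 f''(0) + \tfrac12 \int_0^{3/2} (\tfrac32 - x)^2 f'''(x)\,dx$.

Solving these equations for $f'(0)$ and $f''(0)$ we obtain the two formulae

(12.11)    $f'(0) = -\tfrac83 f(0) + 3 f(\tfrac12) - \tfrac13 f(\tfrac32) + \int_0^{3/2} K_4(x) f'''(x)\,dx$,

where

(12.12)    $K_4(x) = x\left(1 - \tfrac43 x\right)$ if $0 \le x \le \tfrac12$,  $K_4(x) = \tfrac16\left(\tfrac32 - x\right)^2$ if $\tfrac12 \le x \le \tfrac32$,

and

(12.13)    $f''(0) = \tfrac83 f(0) - 4 f(\tfrac12) + \tfrac43 f(\tfrac32) + \int_0^{3/2} K_5(x) f'''(x)\,dx$,

where

(12.14)    $K_5(x) = \tfrac43 x^2 - 1$ if $0 \le x \le \tfrac12$,  $K_5(x) = -\tfrac23\left(x - \tfrac32\right)^2$ if $\tfrac12 \le x \le \tfrac32$.

Notice that in each of these formulae the coefficients of $f(0)$, $f(\tfrac12)$, and $f(\tfrac32)$
alternate in sign and that

(12.15)    $K_4(x) > 0$ and $K_5(x) < 0$ in $0 < x < \tfrac32$.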


We now return to the function $f_0(x)$ defined by (12.4) and graphed in Figure 5.
From Figure 5 and (12.5) we gather the following properties:

(12.16)    $f_0(0) = -1$, $f_0(\tfrac12) = 1$, $f_0(\tfrac32) = -1$,

(12.17)    $f_0'(0) = 9$, $f_0''(0) = -24$,

(12.18)    $f_0'''(x) = 24$ in $[0, \tfrac32)$.

Applying the identities (12.11) and (12.13) to $f_0(x)$, we obtain by (12.15) the relations

(12.19)    $f_0'(0) = 9 = \tfrac83 + 3 + \tfrac13 + 24 \int_0^{3/2} |K_4(x)|\,dx$,

(12.20)    $-f_0''(0) = 24 = \tfrac83 + 4 + \tfrac43 + 24 \int_0^{3/2} |K_5(x)|\,dx$.

If $f(x)$ is a function satisfying the conditions (12.9), we can estimate its derivatives
at the origin by (12.11) and (12.13), and obtain

$|f'(0)| \le \tfrac83 + 3 + \tfrac13 + 24 \int_0^{3/2} |K_4(x)|\,dx$,

$|f''(0)| \le \tfrac83 + 4 + \tfrac43 + 24 \int_0^{3/2} |K_5(x)|\,dx$.

The right hand sides being equal to 9 and 24, respectively, in view of (12.19) and
(12.20), we conclude that

$|f'(0)| \le 9$, $|f''(0)| \le 24$.

Applying this result to $f(x + x_0)$, where $x_0 > 0$, we obtain (12.10). □
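The two differentiation formulae and their kernels can be tested on a smooth function whose derivatives are known. The sketch below assumes the readings $K_4(x) = x(1 - \tfrac43 x)$, $K_5(x) = \tfrac43 x^2 - 1$ on $[0,\tfrac12]$ and $K_4(x) = \tfrac16(\tfrac32 - x)^2$, $K_5(x) = -\tfrac23(\tfrac32 - x)^2$ on $[\tfrac12,\tfrac32]$, which is what solving the two Taylor expansions for $f'(0)$ and $f''(0)$ yields; the midpoint quadrature and the test function $e^x$ are illustrative choices.

```python
import math

def K4(x):
    """Kernel of (12.11), as assumed above."""
    return x * (1 - 4 * x / 3) if x <= 0.5 else (1.5 - x) ** 2 / 6

def K5(x):
    """Kernel of (12.13), as assumed above."""
    return 4 * x * x / 3 - 1 if x <= 0.5 else -2 * (1.5 - x) ** 2 / 3

def midpoint(fn, a, b, n):
    """Composite midpoint rule for the integral of fn over [a, b]."""
    h = (b - a) / n
    return h * sum(fn(a + (k + 0.5) * h) for k in range(n))

def integral(fn):
    """Integral over [0, 3/2], split at the kernel's break point x = 1/2."""
    return midpoint(fn, 0.0, 0.5, 4000) + midpoint(fn, 0.5, 1.5, 8000)

def d1_at_0(f, f3):
    """Right side of (12.11) for f with third derivative f3."""
    return (-8 / 3 * f(0) + 3 * f(0.5) - f(1.5) / 3
            + integral(lambda x: K4(x) * f3(x)))

def d2_at_0(f, f3):
    """Right side of (12.13) for f with third derivative f3."""
    return (8 / 3 * f(0) - 4 * f(0.5) + 4 / 3 * f(1.5)
            + integral(lambda x: K5(x) * f3(x)))

int_abs_K4 = integral(lambda x: abs(K4(x)))   # expected 1/8
int_abs_K5 = integral(lambda x: abs(K5(x)))   # expected 2/3

# exp has every derivative equal to 1 at the origin.
d1_exp = d1_at_0(math.exp, math.exp)
d2_exp = d2_at_0(math.exp, math.exp)

bound_first = 8 / 3 + 3 + 1 / 3 + 24 * int_abs_K4    # right side of (12.19)
bound_second = 8 / 3 + 4 + 4 / 3 + 24 * int_abs_K5   # right side of (12.20)
print(d1_exp, d2_exp, bound_first, bound_second)
```

With these kernels the two bounds evaluate to 9 and 24, matching (12.19) and (12.20).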
We shall now investigate the extremizing functions in Matorin's Theorem 10
and shall see that extremizing functions in the weak sense do not exist. We begin
with


LEMMA 9. 1. If $f(x)$ satisfies (12.9) and

(12.21)    $|f'(\xi)| = 9$ for some $\xi \ge 0$,

then necessarily $\xi = 0$ and

(12.22)    $f(x) = \pm f_0(x)$, $(x \ge 0)$,

where $f_0(x)$ is the function defined by (12.4) (Figure 5).

2. The same conclusions ($\xi = 0$ and (12.22)) hold if

(12.23)    $|f''(\xi)| = 24$ for some $\xi \ge 0$.

Proof: 1. Let us first assume that $\xi = 0$, hence

(12.24)    $f'(0) = 9$.


By an oft repeated argument we conclude from (12.11) and (12.19) that (12.24) is
equivalent to the relation

(12.25)    $-\tfrac83 f(0) + 3 f(\tfrac12) - \tfrac13 f(\tfrac32) + \int_0^{3/2} K_4(x) f'''(x)\,dx = \tfrac83 + 3 + \tfrac13 + 24 \int_0^{3/2} |K_4(x)|\,dx$,

and that this implies that

(12.26)    $f(0) = -1$, $f(\tfrac12) = 1$, $f(\tfrac32) = -1$, $f'''(x) = 24$ in $(0, \tfrac32)$.

This information already suffices to conclude that

(12.27)    $f(x) = f_0(x)$ in $[0, 3/2]$.

But then $f''(3/2) = 12$ (see Figure 5). By Corollary 2' we now conclude that the
identity (12.27) can be extended to $[1/2, 5/2]$. Continuing in this manner we see that
(12.27) holds for $x \ge 0$.

Let us now show that $\xi$ must vanish. Indeed, if

(12.28)    $\xi > 0$ and $f'(\xi) = 9$ (say),

then we conclude, as in (12.26), that $f(\xi) = -1$, and so forth. But then we must
have $f'(\xi) = 0$ (or else $\|f\|_+ \le 1$ would be violated!), which contradicts the
assumption $f'(\xi) = 9$.

2. If (12.23) holds, we apply similar reasonings using formulae (12.13) and
(12.20). □




THEOREM 11. Let

(12.29)    $\|f\|_+ \le 1$, $\|f'''\|_+ \le 24$,

and therefore

(12.30)    $\|f'\|_+ \le 9$, $\|f''\|_+ \le 24$.

If the equality sign holds in one of the inequalities (12.30), then

(12.31)    $f(x) = \pm f_0(x)$ for $x \ge 0$,

where $f_0(x)$ is the function defined by (12.4) (Figure 5).

Proof: 1. Let us suppose that

(12.32)    $\|f'\|_+ = 9$.

If this supremum is assumed, hence (12.21) holds, then the conclusion (12.31) is
already assured by Lemma 9. We may therefore assume that

(12.33)    $|f'(x)| < 9$ for $x \ge 0$,

and let us show that this cannot happen by reaching a contradiction.

By (12.32) and (12.33), there exists an infinite sequence $(x_\nu)$ of points of $\mathbf{R}_+$
such that

(12.34)    $\lim_{\nu\to\infty} f'(x_\nu) = 9$.

(If this limit were $-9$ we could work with $-f(x)$.) In the interval $x \ge 1/2$ we can
apply the differentiation formula (B) of Lemma 8, in the form

$f'(x) = f(x + \tfrac12) - f(x - \tfrac12) + \int_0^1 K_2(t) f'''(t + x - \tfrac12)\,dt$,

to conclude from (8.17) that $|f'(x)| \le 1 + 1 + 24 \int_0^1 |K_2(t)|\,dt = 3$. Thus

(12.35)    $|f'(x)| \le 3$ if $x \ge \tfrac12$.

Confronting (12.34) with (12.35) we conclude that $0 \le x_\nu < \tfrac12$ if $\nu > N$. The
Bolzano-Weierstrass theorem insures the existence of an appropriate infinite sequence of
increasing integers $(\nu')$ such that

(12.36)    $\lim x_{\nu'} = \xi$, for some $\xi$ within $[0,\tfrac12]$.

Using the continuity of $f'(x)$, we conclude from (12.36) and (12.34) that $f'(\xi) = 9$,
which contradicts our assumption (12.33).

2. If $\|f''\|_+ = 24$, we may use entirely similar arguments. If the supremum is
assumed, we use Lemma 9. That the supremum is always assumed is shown by
contradiction as above: Formula (C) of Lemma 8 shows that $|f''(x)| \le 12$ if $x \ge 1$
(here we use (8.26)!), and the continuity of $f''(x)$ takes care of the rest. □


13. The case $n = 4$ is not elementary. Our success in attacking the Landau
problem for $\mathbf{R}_+$ for $n = 2$ and $n = 3$ with the modified Euler splines $f_0(x)$ seems
surprising, to say the least. However, for $n = 4$ this approach does not work
anymore. To make it clear why, let us try to do it. Our problem is to study functions
satisfying

(13.1)    $\|f\|_+ \le 1$, $\|f^{(4)}\|_+ \le 384/5 = 76.8$

and to determine within this class the best, or least, constants $\gamma^+_{4,\nu}$ such that

(13.2)    $\|f^{(\nu)}\|_+ \le \gamma^+_{4,\nu}$ $(\nu = 1,2,3)$.

Stated equivalently: Within the class of functions satisfying (13.1) we wish to
maximize each of the three norms on the left side of (13.2).

We start from $\mathcal{E}_4(x)$. From (4.8) we know that

$P(x) = 1 - \tfrac{24}{5} x^2 + \tfrac{16}{5} x^4 = \mathcal{E}_4(x)$ if $-\tfrac12 \le x \le \tfrac12$.

We consider $\mathcal{E}_4(x)$ for $x \ge -\tfrac12$ only, and remove its knot at $x = -\tfrac12$ to continue
the graph of the quartic $P(x)$ for $x \le -\tfrac12$. We find that it has a minimum value at
$x = -\sqrt{3}/2 = -.866$, where it assumes the value $-4/5$, and thereafter increases
to $+\infty$ as $x \to -\infty$. The new function so obtained satisfies the second condition
(13.1). However, to satisfy also the first condition (13.1), we must cut it off at the
point where it intersects the line $y = 1$. This is found to take place at $x = -\sqrt{6}/2
= -1.225$. Accordingly we define

$g(x) = 1 - \tfrac{24}{5} x^2 + \tfrac{16}{5} x^4$ in $[-\sqrt{6}/2, 0]$,  $g(x) = \mathcal{E}_4(x)$ in $[0, \infty)$.

As before, we shift the origin to $-\sqrt{6}/2$ and define the function

(13.3)    $f_0(x) = g(x - \tfrac{\sqrt{6}}{2})$ for $x \ge 0$ (see Figure 6).

We find that

(13.4)    $\|f_0'\|_+ = -f_0'(0) = 48\sqrt{6}/10 = 11.7576$,
          $\|f_0''\|_+ = f_0''(0) = 48$,
          $\|f_0'''\|_+ = -f_0'''(0) = 384\sqrt{6}/10 = 94.0604$.


These values are surely lower bounds for the best constants $\gamma^+_{4,\nu}$ of (13.2). However,
our $f_0(x)$ is certainly not an extremizing function. This can be seen from Figure 6

FIG. 6. The function $f_0(x)$, with $\alpha = \tfrac12(\sqrt{6} - \sqrt{3}) = .358$ and $\beta = \tfrac12(\sqrt{6} - 1) = .725$.

because the first minimum value of $f_0(x)$ is $= -4/5$ and thereby fails to reach
down to the line $y = -1$. However, I do not know any explicitly defined function
$f(x)$, satisfying (13.1), whose norms are superior to the norms (13.4) of $f_0(x)$.
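The numbers quoted for the quartic branch are easy to recompute. The sketch below works directly with $P(x) = 1 - \tfrac{24}{5}x^2 + \tfrac{16}{5}x^4$ and its derivatives at the cut-off point $x = -\sqrt6/2$, which becomes the origin of $f_0$ after the shift; the variable names (including `alpha` and `beta` for the two abscissae marked in Figure 6) are illustrative.

```python
import math

def P(x):
    """Quartic branch of the Euler spline, continued to the left."""
    return 1 - 24 / 5 * x ** 2 + 16 / 5 * x ** 4

def P1(x):
    return -48 / 5 * x + 64 / 5 * x ** 3     # P'

def P2(x):
    return -48 / 5 + 192 / 5 * x ** 2        # P''

def P3(x):
    return 384 / 5 * x                       # P'''

x_min = -math.sqrt(3) / 2    # location of the minimum with value -4/5
x_cut = -math.sqrt(6) / 2    # point where P meets the line y = 1

# f0(x) = g(x - sqrt(6)/2), so derivatives of f0 at 0 are those of P at x_cut.
norm_d1 = -P1(x_cut)         # expected 48*sqrt(6)/10
norm_d2 = P2(x_cut)          # expected 48
norm_d3 = -P3(x_cut)         # expected 384*sqrt(6)/10

# Abscissae in the shifted variable: first minimum and first zero of f0.
alpha = (math.sqrt(6) - math.sqrt(3)) / 2
beta = (math.sqrt(6) - 1) / 2

print(P(x_min), P(x_cut), norm_d1, norm_d2, norm_d3, alpha, beta)
```

The first minimum of $-4/5$ is the quantity that falls short of $-1$, which is the reason given above for $f_0$ not being extremal.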
At this point we state (see [11]) 


THE $\mathbf{R}_+$-ANALOGUE OF KOLMOGOROV'S THEOREM. Let $n \ge 2$. There is a spline
function $e_n(x)$ of degree $n$, satisfying $\|e_n\|_+ = 1$, $\|e_n^{(n)}\|_+ = \gamma_n$, with the following
property: If

(13.5)    $\|f\|_+ \le 1$, $\|f^{(n)}\|_+ \le \gamma_n$,

then

(13.6)    $\|f^{(\nu)}\|_+ \le \|e_n^{(\nu)}\|_+ = |e_n^{(\nu)}(0)|$, $(\nu = 1,2,\dots,n-1)$.

These are the best constants because we have equalities if $f(x) = e_n(x)$. If $n \ge 3$,
then $\pm e_n(x)$ are the only functions with these properties.


We call $e_n(x)$ the one-sided Euler spline of degree $n$. Just like $\mathcal{E}_n(x)$, also $e_n(x)$
has the property that $e_n^{(n)}(x)$ is a step-function assuming the values $\pm\gamma_n$ only.
Figures 4 and 5 show the graphs of $e_2(x)$ and $e_3(x)$, respectively. The knots of $f_0(x)$
(Figure 6) are at its zeros $\beta + 1, \beta + 2, \dots$. The graph of $e_4(x)$ looks much like the
graph of $f_0(x)$ (Figure 6), except that also its first minimum is $= -1$. However,
the knots of $e_4(x)$ do not agree with its zeros, but approach them in the limit as
we approach $+\infty$.

No explicit expressions are known for $e_n(x)$ $(n \ge 4)$. Rather $e_n(x)$ is defined in [11]
as the limit of a sequence of spline functions of degree $n$ that are themselves defined
by minimum properties. In deriving the numerical results of [11] good approximations
of $e_n(x)$, for $n = 4, 5, 6$, are used. These approximations furnish for $n = 4$ the
values of the best constants in (13.2):


$\gamma^+_{4,1} = -e_4'(0) = 12.695$,
$\gamma^+_{4,2} = e_4''(0) = 50.393$,
$\gamma^+_{4,3} = -e_4'''(0) = 96.197$.


In conclusion let me say the following. The Landau problems are extremum
problems. Faced with an extremum problem we are often well on the way to its
solution, provided that we are lucky enough to guess what the extremizing function is.
The extremizing functions $\mathcal{E}_n(x)$ of the $\mathbf{R}$-problem are beautiful, simple, and easily
computable functions. This is decidedly not the case of $e_n(x)$, if $n \ge 4$, and this is
the reason why the $\mathbf{R}_+$-problem was more difficult to solve.


This work was sponsored by the Mathematics Research Center, Madison, Wisconsin, under 
Contract No. DA-31-124-ARO-D-462. 


References 


1. M. Abramowitz and I. A. Stegun (Editors), Handbook of mathematical functions with formulas,
graphs and mathematical tables, National Bureau of Standards, Washington, D. C., 1964.

2. T. Bang, Une inégalité de Kolmogoroff et les fonctions presque-périodiques, Danske Vid.
Selsk. Math. Fys. Medd., 19 (1941) No. 4, 28 pages.

3. W. A. Coppel, Stability and asymptotic behavior of differential equations, Heath, Boston,
1965.

4. A. Kolmogorov, On inequalities between the upper bounds of the successive derivatives of an
arbitrary function on an infinite interval, Amer. Math. Soc. Translations, Series 1, 2 (1962) 233-243.
This paper appeared originally in Russian in 1939.

5. E. Landau, Einige Ungleichungen für zweimal differentiierbare Funktionen, Proc. London
Math. Soc., (2) 13 (1913) 43-49.

6. E. Landau, Die Ungleichungen für zweimal differentiierbare Funktionen, Danske Vid. Selsk.
Math. Fys. Medd., 6 (1925) No. 10, 49 pages.

7. E. Landau, Über einen Satz von Herrn Esclangon, Math. Ann., 102 (1929) 177-188.

8. A. P. Matorin, On inequalities between the maxima of the absolute values of a function and
its derivatives on a half-line, Amer. Math. Soc. Translations, Series 2, 8 (1958) 13-17.

9. N. E. Nörlund, Vorlesungen über Differenzenrechnung, Springer Verlag, Berlin, 1924.

10. I. J. Schoenberg, Cardinal interpolation and spline functions II. Interpolation of data of power
growth, MRC T. S. R. 1104, Madison, Wisconsin, 1970. To appear in J. of Approx. Theory.

11. I. J. Schoenberg and A. Cavaretta, Solution of Landau's problem concerning higher derivatives on
the halfline, MRC T. S. R. 1050, Madison, Wisconsin, 1970. Also in Proc. of the Intern. Conf.
on Constructive Function Theory, Golden Sands (Varna) May 19-25, 1970, Publ. House
Bulgarian Acad. Sci., Sofia, (1972) 297-308. The MRC T. S. R. 1050 is a more accurate version of
the paper.

12. I. J. Schoenberg, Cardinal interpolation and spline functions VII. The behavior of cardinal spline
interpolants as their degree tends to infinity, MRC T. S. R. 1184, Madison, Wisconsin, 1971. To
appear in J. d'Analyse Math. (Jerusalem).

13. I. J. Schoenberg, Cardinal interpolation and spline functions VIII. To appear as an MRC T. S. Report.

14. J. N. Subbotin, On the relation between finite differences and the corresponding derivatives,
Proc. Steklov. Inst. Math., 78 (1965) 24-42. Amer. Math. Soc. Translations, (1967), 23-42.


TYPES OF FULLY ORDERED GROUPS 
D. P. MINASSIAN, Butler University, Indianapolis 


Introduction. In this paper we shall introduce the subject of fully ordered groups 
and shall then examine some of these groups as discussed by Russian mathematicians 
in the last decade. (Apparently, translations of the Russian references are not in 
print.) We shall exhibit relationships among the groups introduced by the Russians 
and between them and more traditional classes of ordered groups. In Section 1 we 
shall give somewhat more than is needed as a basis for Section 2 since I assume 
that a study of fully ordered groups is new to most readers and so shall present some 
of the more interesting results (cf. [2]). 

The following table of abbreviations is a quick reference to some notation we 
shall use, although complete definitions also appear in the text. 


Table of Abbreviations 


Nssg(x,y ...): the normal subsemigroup of a group G generated by the elements x,y... of G. 

O-group: a group which admits some full order. 

O*-group: a group in which each partial order can be extended to some full order. 

S-ext group: a group G in which every full order for each subgroup can be extended to some full 
order for G. 

Sa-ext group: a group G in which every full order for each abelian subgroup can be extended to 
some full order for G. 

Sn-ext group: a group G in which every full order for each normal subgroup can be extended to some 
full order for G. 

San-ext group: a group G in which every full order for each abelian normal subgroup can be 
extended to some full order for G. 

S*-ext group: a group G in which every partial order for each subgroup can be extended to some full 
order for G. 

Sn*-ext group: a group G in which every partial order for each normal subgroup can be extended 
to some full order for G. 


1. Preliminaries on ordered groups. 


DEFINITIONS. A partial order for a group $G$ is a relation $\le$, "less than or equal
to," on $G$ with the usual properties: $\le$ is reflexive, transitive, antisymmetric ("$a \le b$
and $b \le a$" imply $a = b$), and satisfies "$a \le b$ if and only if $xay \le xby$ for each
$x$ and $y$ in $G$." A full or linear or simple order for $G$ is a partial order for $G$ such
that each $a$ and $b$ in $G$ are comparable (either $a \le b$ or $b \le a$). An example of a


Donald Minassian received his Univ. of Michigan Doctorate under E. F. Krause. Previous to 
that he taught mathematics, American history, and Latin in secondary schools and at two-year 
colleges, and took part in several summer institutes. Since taking his Degree he has been an Associate 
Professor at Butler University. He has worked on ordered groups and various topics in economics. 
Editor. 




160 D. P. MINASSIAN [February 


fully ordered group is the group of additive integers under the familiar ordering; 
we shall give other examples below. An O-group (orderable group) is a group that 
admits some full order while the stronger O*-group is one for which each partial 
order can be extended to some full order (i.e., to a full order which may vary de- 
pending on the initial partial order). 

We state some basic facts and begin with two theorems giving necessary and 
sufficient conditions that an abstract group be an O-group or an O*-group. 


THEOREM 1.1 ([16], [17], [21]). A group $G$ with identity element $e$ admits a full
order (i.e., $G$ is an O-group) if and only if, given $a_1, \dots, a_n$ in $G$ with each $a_i \ne e$,
then for at least one choice of the signs $\delta_i = \pm 1$ one has

$e \notin \mathrm{Nssg}(a_1^{\delta_1}, \dots, a_n^{\delta_n})$,

where $\mathrm{Nssg}(x, y, \dots)$ denotes the normal subsemigroup of $G$ generated by $x, y, \dots$;
that is, $\mathrm{Nssg}(x, y, \dots)$ consists of all products of conjugates of $x$, $y$, etc.


REMARKS. "Only if" is easy to show (cf. Remarks following Theorem 1.2 below).
Theorem 1.1 is also a corollary to a result of Fuchs on the extension of a
given partial order for a group to some full order for the group; see [2, p. 34 to p. 36
line 4]. Another set of conditions for a group $G$ to admit a full order is given in
[2, pp. 50-54]. Briefly, $G$ is an O-group if and only if $G$ admits a "solvable normal
system" $\Sigma$ of subgroups which satisfies certain additional properties. More
specifically, $\Sigma$ is a chain (under $\subseteq$) containing $\{e\}$ and $G$ and is closed under unions,
intersections, and conjugation by elements of $G$, and meets certain other
requirements particularly regarding the "jumps." (A "jump" is a pair $C \supset D$ of distinct
elements of $\Sigma$ with no element of $\Sigma$ in between.) For example, in any jump $C \supset D$
the subgroup $D$ is normal in $C$, and $C/D$ is isomorphic to a subgroup of the
additive real numbers. (Hence $C/D$ is abelian, whence the solvable normal system
referred to above.) Note: if $G$ is fully ordered, the elements of one such $\Sigma$ are
the convex subgroups of $G$; see Definition above Proposition 1.4.


THEOREM 1.2 [22]. A group $G$ has the property that each partial order can be
extended to some full order (i.e., $G$ is an O*-group) if and only if

(i) if $b$ and $c$ are in $\mathrm{Nssg}(a)$, then $\mathrm{Nssg}(b)$ and $\mathrm{Nssg}(c)$ intersect, and

(ii) if $a \ne e$, then $e \notin \mathrm{Nssg}(a)$, where, as above, $\mathrm{Nssg}(x)$ denotes the normal
subsemigroup of $G$ generated by $x$.


REMARKS. Condition (ii) is easily seen to be equivalent to "if $a \neq e$, then the
intersection of Nssg(a) and Nssg($a^{-1}$) is null." Further, if a group G satisfies (ii),
then G is called generalized torsion-free. We note (cf. Theorem 1.1) that any O-group
G is generalized torsion-free (and hence torsion-free), for an equation of form


(*) $\prod_i (x_i a x_i^{-1}) = e$


1973] TYPES OF FULLY ORDERED GROUPS 161 


would imply that a product of elements $x_i a x_i^{-1}$, each greater than e or each less
than e, equals e, which is impossible; in the abelian case a strong converse holds:


COROLLARY 1.3. If G is a torsion-free abelian group, then each partial order for
G extends to some full order for G (i.e., G is an O*-group).


Proof. In the abelian case (ii) of Theorem 1.2 is equivalent to "G is torsion-free"
since (*) of Remarks above reduces to $a^k = e$, where k is the number of factors.
Also, (i) holds in any abelian group since if $b = a^m$ and $c = a^n$, then $a^{mn}$ is in
both Nssg(b) and Nssg(c). (Note: an easy direct proof of this corollary is given in [1].)


DEFINITION. The set of all elements g in a partially ordered group G satisfying
$e \le g$, where e is the group identity, is the (nonnegative) cone P.

In any ordered group the cone P determines the ordering since $a \le b$ if and only if
$e = aa^{-1} \le ba^{-1}$, that is, if and only if $ba^{-1}$ is in P. Thus P itself is often called
"a partial order" for G. It is easy to show [2, p. 13, Theorem 2] that a subset P
of a group G is a cone for G if and only if P is a normal subsemigroup of G satisfying
$P \cap P^{-1} = \{e\}$, where $P^{-1}$ consists of all the inverses of the elements of P.
Clearly, such P is the cone of a full order for G if and only if $P \cup P^{-1} = G$.
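
The passage from a cone to its order relation can be checked mechanically on a small case. The following Python sketch is an illustration only: it assumes the additive group of integers with the cone P of nonnegative even integers (the partial order used as an illustration in the Remarks at the end of this section), and the names in_cone and leq are ours. Since the group is written additively, $ba^{-1}$ becomes b - a, and all checks run over a finite sample rather than the whole group.

```python
def in_cone(g: int) -> bool:
    """Membership in the assumed cone P = {0, 2, 4, ...} of the integers."""
    return g >= 0 and g % 2 == 0


def leq(a: int, b: int) -> bool:
    """a <= b in the order determined by P: b - a (additive b a^{-1}) lies in P."""
    return in_cone(b - a)


sample = range(-6, 7)

# P is a subsemigroup; normality is automatic because the group is abelian.
closed = all(in_cone(p + q)
             for p in sample for q in sample
             if in_cone(p) and in_cone(q))

# P and the set of its inverses meet only in the identity 0.
meets_only_in_identity = [g for g in sample if in_cone(g) and in_cone(-g)] == [0]

# P together with its inverses misses the odd integers, so the order is partial.
is_full_on_sample = all(in_cone(g) or in_cone(-g) for g in sample)
```

Under this cone 1 and 3 are comparable (3 - 1 is even and positive) while 0 and 1 are not, which is exactly what distinguishes a partial order from a full one.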

Each partial order P for a group G induces a partial order on any subgroup H:
simply let the cone for H be $P \cap H$. Clearly, if P is a full order for G, then $P \cap H$
is a full order for H. Now suppose the subgroup H is normal in the fully ordered
group G. Does P induce an order on the factor group G/H, that is, an order
under which Hg is positive in G/H if and only if g is positive in G? To help answer
this question we give another definition.


DEFINITION. A subset S of the partially ordered group G is convex in G if,
for each pair a and b of elements in S, the relation $a \le g \le b$ always implies that
g is in S. (For example, under the familiar ordering of the real line the convex
subsets are the ordinary intervals. In example 2 below, the pure imaginary numbers
yi are a convex subgroup.)


PROPOSITION 1.4. If the normal subgroup N of the fully ordered group G is 
convex in G, then G/N admits the induced partial ordering: Ng is positive if and 
only if g is positive in G. 


Proof. We define the cone for G/N to consist of the identity coset together with
those cosets Ng where $g \notin N$ is positive in G (we must still prove this defines a cone).
Now for such g the fact that N is convex in G rules out any relation of form $e \le g \le n$,
where n is in N. Thus we conclude that g exceeds all n in N, and so all elements
of such Ng are positive in the given order for G. Hence it does not matter how the
coset Ng is represented, i.e., "once positive, always positive". To verify that the
requirements for a cone on G/N are met is now routine, and we omit the details. Note
that the cone on G/N is the image, under the natural map $G \to G/N$, of the cone
for G; hence the term induced order on G/N.
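
The "once positive, always positive" step can be tried out on a concrete case. The sketch below is only an illustration under our own assumptions: G is taken to be the pairs of integers under componentwise addition with the lexicographic order, N is the subgroup of pairs (0, y), and the helper names are hypothetical. The checks range over a finite box of elements, so they illustrate the argument rather than prove it.

```python
from itertools import product


def positive(g):
    """Lexicographic cone on pairs of integers: g >= (0, 0)."""
    x, y = g
    return x > 0 or (x == 0 and y >= 0)


def leq(a, b):
    """a <= b if and only if b - a lies in the cone."""
    return positive((b[0] - a[0], b[1] - a[1]))


def in_N(g):
    """The subgroup N of pairs (0, y)."""
    return g[0] == 0


def in_M(g):
    """The subgroup M of pairs (x, 0), used as a non-convex contrast."""
    return g[1] == 0


box = list(product(range(-3, 4), repeat=2))

# Convexity of N on the sample: a <= g <= b with a, b in N forces g into N.
convex = all(in_N(g)
             for a in box if in_N(a)
             for b in box if in_N(b)
             for g in box if leq(a, g) and leq(g, b))

# If g is positive and outside N, every representative g + n of its coset is positive.
well_defined = all(positive((g[0] + n[0], g[1] + n[1]))
                   for g in box if positive(g) and not in_N(g)
                   for n in box if in_N(n))

# M is not convex: (0, 0) <= (0, 1) <= (1, 0), yet (0, 1) is not in M.
m_not_convex = (leq((0, 0), (0, 1)) and leq((0, 1), (1, 0))
                and in_M((0, 0)) and in_M((1, 0)) and not in_M((0, 1)))
```

The contrast with M shows why convexity is the hypothesis that matters: the coset of (0, 1) modulo M contains both positive and negative elements, so no induced order exists there.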


162 D. P. MINASSIAN [February 


The converse to Proposition 1.4 also holds; i.e., if N is a normal subgroup of
the fully ordered group G such that G/N admits the induced order, then N is convex
under the given order for G. For in this case the natural homomorphism $G \to G/N$
preserves order and hence, as can be easily shown for any order-preserving map
between partially ordered sets (sic), the preimage of a convex subset is convex
(note: if both sets are fully ordered and the map is onto, then convexity is preserved
in both directions). In particular, then, the kernel N of the homomorphism is a convex
subset of G since N is the preimage of the "vacuously convex" identity subgroup.

Here is a related question: If G does not have an ordering to begin with, then 
when do given orders on N and G/N give rise to an order for G which induces the 
given orders on N and G/N? 


PROPOSITION 1.5. If N is a normal subgroup of the group G, and if both N and
G/N are partially ordered, then in order that G admit a partial order inducing 
those on N and G/N, it is necessary and sufficient that the cone for N be invariant 
in G (i.e., under conjugation by all elements of G). Clearly, if the orders for N 
and G/N are full, then so is the order for G. 


Proof. Necessity follows from the normality in G of N (given) and of the cone 
for G. For sufficiency, we let the reader verify that the conditions for a cone for G 
are satisfied by the union of the cone for N and the (strictly) positive cosets of G/N. 

It is immediate that in any group admitting a full order the equation $x^n = a$
has at most one solution for each a (such groups are sometimes called R-groups):
for if c < d are two such solutions, then $c^n < d^n$, a contradiction. On the other hand,
not all R-groups are O-groups; for an example see [14].


DEFINITION. A fully ordered group G is archimedean if the relation


$a^n < b$ for all integers n


always implies a = e.
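
A quick numerical illustration of this definition, read here as "if every power of a lies below b, then a is the identity": the Python sketch below assumes the group of integer pairs under componentwise addition with the lexicographic order (our choice of example), written additively so that the n-th power of a is n * a. Only a finite range of n is tested, so this illustrates the failure of the archimedean property rather than proving it.

```python
def less(a, b):
    """Strict lexicographic order on pairs; Python compares tuples lexicographically."""
    return a < b


def power(a, n):
    """The n-th 'power' of a in additive notation, i.e. n * a."""
    return (n * a[0], n * a[1])


a, b = (0, 1), (1, 0)

# Every tested multiple of a stays below b although a is not the identity (0, 0).
bounded = all(less(power(a, n), b) for n in range(-1000, 1001))

# In the integers with the usual order, some multiple of 1 reaches 1000.
integers_escape = any(n * 1 >= 1000 for n in range(-1000, 1001))
```

The pair a = (0, 1) plays the role that i plays in the lexicographically ordered complex numbers: it is "infinitely small" compared with b.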

The following two theorems show that all archimedean, and all continuous 
(see Theorem 1.7), fully ordered groups are essentially subgroups of the additive 
real numbers. 


THEOREM 1.6. ([4, pp. 13-14] [2, p. 45]). A fully ordered group G is archimedean 
if and only if there is an isomorphism, which preserves order, from G onto a 
subgroup of the naturally ordered additive group R of real numbers. That is, 
such G are subgroups of R. 


THEOREM 1.7 ([15], [2, p. 47]). If $G \neq \{e\}$ is a continuous fully ordered group
(i.e., each Dedekind section determines one and only one element), then there 
is an isomorphism, which preserves order, from G onto the naturally ordered 
additive group R of real numbers. 


The following result provides many examples of nonabelian fully ordered 
groups, as the corollary illustrates. 


THEOREM 1.8 ([25]). The free product of fully ordered groups admits a full order.


COROLLARY 1.9. All free groups admit a full order.


Proof. A free group is the free product of infinite cyclic groups, each of which 
inherits the natural full order (or its negative) from the real numbers. (For another 
proof see, e.g., [2, pp. 47-49]. In fact, in an analog to the famous result that every 
group is the homomorphic image of a free group, B. H. Neumann and K. Iwasawa 
independently have proved that every fully ordered group is the image, under an 
order-preserving homomorphism, of a fully ordered free group; see, e.g., [2, p. 49, 
Theorem 9]. The proof there also shows that each partially ordered group is the 
image, under an order-preserving homomorphism, of a partially ordered free group 
where the kernel is fully ordered.) 

The following are examples of fully ordered groups: 

1. Any subgroup of the additive group of real numbers under the natural ordering. 

2. Any subgroup of the additive group of complex numbers under the lexicographic
ordering: $x + yi \le 0$ if $x < 0$, or if $x = 0$ and $y \le 0$. Note that extension
of the lexicographic scheme fully orders the n-fold Cartesian product of fully ordered
groups for each finite n. (More generally, a Cartesian product, under a well-ordered
indexing set, of fully ordered groups admits a full order.) Note also that the order
relation is non-archimedean (for instance, $ni < 1$ for all integers n yet $i \neq 0$).

3. A nonabelian example is the multiplicative group of real matrices of form


$$\begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix}$$


where the cone P consists of matrices where $a > 0$, or $a = 0$ and $b > 0$, or $a = b = 0$
and $c \ge 0$.
4. See nonabelian example in Section 2 and order G lexicographically. 
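
The cone conditions for example 3 can be verified on a finite sample. The following Python sketch makes several assumptions of its own: a, b, c are taken to be the (1,2), (2,3), and (1,3) entries of the unitriangular matrix (the labelling suggested by the Remarks below), entries are restricted to small integers, and the function names are hypothetical. A matrix is stored as the triple (a, b, c), so matrix multiplication becomes the rule coded in mul.

```python
from itertools import product


def mul(g, h):
    """Product of unitriangular matrices stored as triples (a, b, c)."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(g):
    """Inverse of the matrix (a, b, c)."""
    a, b, c = g
    return (-a, -b, a * b - c)


def in_cone(g):
    """a > 0, or a = 0 and b > 0, or a = b = 0 and c >= 0."""
    a, b, c = g
    return a > 0 or (a == 0 and b > 0) or (a == 0 and b == 0 and c >= 0)


E = (0, 0, 0)
box = list(product(range(-2, 3), repeat=3))
P = [g for g in box if in_cone(g)]

closed = all(in_cone(mul(p, q)) for p in P for q in P)                 # subsemigroup
normal = all(in_cone(mul(mul(g, p), inv(g))) for g in box for p in P)  # invariant under conjugation
pointed = [g for g in box if in_cone(g) and in_cone(inv(g))] == [E]    # P meets P^{-1} only in e
full = all(in_cone(g) or in_cone(inv(g)) for g in box)                 # P and P^{-1} cover the sample
nonabelian = mul((1, 0, 0), (0, 1, 0)) != mul((0, 1, 0), (1, 0, 0))
```

Conjugation leaves a and b unchanged and alters c only when a or b is nonzero, which is why the lexicographic choice of cone survives conjugation.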


REMARKS. All these groups are actually O*-groups (each partial order extends
to some full order), the first two in view of Corollary 1.3 and the last two by (for
instance) the result in [9] that any 2-step solvable group which admits a full order
is an O*-group. (The fact that G in example 4 is 2-step solvable is shown in Section 2.
The group in example 3 is 2-step solvable since it is an extension of the abelian
group of all such matrices where a = 0 by the abelian group of matrices where
b = c = 0; we omit the routine verification.) To illustrate, in example 1 the partial
order on the integers under which the positive cone P consists only of the
nonnegative even integers extends to a full order (the usual one) for the integers. In
example 2, the usual order on the subgroup of real numbers is a partial order for
the whole complex group which extends to the given full order. Like extensions
apply in examples 3 and 4.


2. Russian work on ordered groups. The chart below is an attempt to portray 
visually the relationships discussed. 


DEFINITIONS. A group G is a S-ext group if every full order for each subgroup 
of G extends to some full order for G. (For the distinguishing properties of such 
groups see Theorem 2.2.) Similarly G is a Sa-ext (resp. Sn-ext, San-ext) group if 
every full order for each abelian subgroup (resp. normal, abelian normal sub- 
group) of G extends to some full order for G. Group G is called a S*-ext group 
if each partial order for any subgroup of G extends to some full order for G. (The 
Russian literature uses the notation V-group, VA-group, VN-group, VAN-group, 
V*-group respectively for S-ext group, Sa-ext group, Sn-ext group, San-ext group, 
S*-ext group.) An exhaustive list of references to these groups is [5], [8], [11], [12], 
[20], [23], [24], but we shall be somewhat selective in quoting the results therefrom. 

If G satisfies any of the above definitions, then G admits a full order (i.e., G is an 
O-group) since the trivial order on the identity subgroup extends to a full order for G. 
Clearly, a S*-ext group is a S-ext group, and a S-ext group is both a Sa-ext and a 
Sn-ext group, each of which is a San-ext group. Further, the designations 
Sa*-ext group (every partial order for each abelian subgroup extends to some full 
order for G) and San*-ext group (obvious meaning) are superfluous. For G is a 
Sa*-ext (resp. San*-ext) group if and only if G is a Sa-ext (resp. San-ext) group 
because any partial order for any abelian subgroup H of a torsion-free group G
extends to a full order for H by Corollary 1.3.

On the other hand, it is unknown if a Sn-ext group is a Sn*-ext group. However,
we shall see that a solvable group G is a Sn-ext group if and only if G is a Sn*-ext
group.

The situation for abelian groups is very simple.


PROPOSITION 2.1. For an abelian group G all of the designations S*-ext, S-ext, 
Sn*-ext, Sn-ext, Sa-ext, San-ext (thus also, Sa*-ext and San*-ext in view of their 
superfluousness, noted above), O*, O, torsion-free, are equivalent. 


Proof. For any group (even if not abelian) the designation S*-ext clearly implies
all the others, while torsion-free is implied by all the others. Thus all we need show
is that "abelian and S*-ext" implies "torsion-free" and, conversely, "abelian and
torsion-free" implies S*-ext. In fact, the first "abelian" is superfluous. For if G, abelian
or not, is a S*-ext group, then the trivial order (only e is in the cone) extends to a full
order for G; hence G is an O-group and thus torsion-free. Conversely, suppose an
abelian group G is torsion-free. Then a partial order P for a subgroup of G is a partial
order for G since P is normal in abelian G. Thus, P extends to a full order for G
by Corollary 1.3. Hence G is a S*-ext group.

The situation for a nonabelian group G is not so cut and dried. A principal tool is 
this theorem of Kargapolov [5, Theorem, p. 17]: 


THEOREM 2.2. An arbitrary torsion-free group G has the property that each
full order for any subgroup of G extends to some full order for G (i.e., G is a S-ext
group) if and only if G has an abelian normal subgroup A such that (1) the factor
group G/A is abelian, and (2) for arbitrary elements a in A and b in G−A there
are positive integers $m \neq n$ such that $b^{-1} a^m b = a^n$.


REMARKS. Condition 2 shows that any such G is metabelian, by which we mean
either one-step solvable, i.e., abelian, or two-step solvable. Necessity for Theorem
2.2 is essentially due to Terehov [23]. The fact that a, above, is not arbitrary in G
but must lie in A, is omitted from the statement of the theorem, but not the proof,
in [5]; that a is not arbitrary is trivial: set $a = b \neq e$. Also, it is clearly
unnecessary to state, as in [5], "$a \neq e$."

Here is an outline of the proof: 


Sufficiency: One may verify that G/A is torsion-free. Now let H be any subgroup
of G with a full order P. Define $A_1 = H \cap A$ and $P_1 = P \cap A$. Clearly, $A_1$ is a
normal subgroup of H, and so $H/A_1$ is a group. It is shown that $A_1$ is also convex
in H (see the definition in section 1). Thus (see Proposition 1.4) under the natural
map $H \to H/A_1$ the cone P induces an order $\bar{P}$ on $H/A_1$ as follows: $A_1 x$ is in $\bar{P}$
if and only if x is in P. Further, under the natural isomorphism $A_1 h \to Ah$ between
$H/A_1$ and HA/A, the cone $\bar{P}$ gives rise to a partial order on $G/A \supseteq HA/A$ which
extends to a full order $\bar{Q}$ on G/A since G/A is a torsion-free abelian group and hence
O* by Corollary 1.3. Likewise the cone $P_1$ extends to a full order $Q_1$ for the torsion-free
abelian group A. Since $Q_1$ is invariant in G by (2), one may construct (see
Proposition 1.5) a full order Q for G as follows: x in G belongs to Q if and only if x is
in $Q_1$, or x is not in A and Ax is in $\bar{Q}$. Finally, $P \subseteq Q$.


Necessity (we give but a brief sketch): The system $\{N_\alpha\}$ of all convex subgroups
of G forms a solvable normal system (cf. [2, pp. 50-54]). Each $N_\alpha$ is normal in G
and each serving subgroup of $N_{\alpha+1}/N_\alpha$ is normal in $G/N_\alpha$. (A serving subgroup H
of a group G is a subgroup of G such that, for each h in H and each natural number
n, the equation $x^n = h$ can be solved in H if it can be solved in G.) This helps
show G/Z is abelian, where Z is the intersection of the preimages of the centralizers
$Z_{\alpha+1}/N_\alpha$ of the factors $N_{\alpha+1}/N_\alpha$ under the natural homomorphisms $G \to G/N_\alpha$.
Also, the group Z is proved abelian. Let A be a maximal abelian normal subgroup
of G containing Z. Each serving subgroup of A is normal in G and so (2) holds.


REMARK. In [24] Terehov shows that any group G satisfying the conditions of
Theorem 2.2 can be embedded in a S-ext group where the abelian normal subgroup
corresponding to A is a divisible group, i.e., contains all roots of each of its elements.


A nonabelian example (Terehov [24, bottom of p. 35]). Let G be the set of all 
ordered pairs (x, y) with integer x and rational number y under the operation 


$$(x, y) \oplus (z, q) = (x + z,\; r^{z} y + q),$$


where r is any fixed positive rational number except 1. It is routine to check that G 
is a torsion-free nonabelian group, and that the subset A = {(0, y), all rational y} 
is a normal subgroup of G such that A and G/A are isomorphic to the additive 
rational numbers and additive integers respectively. Further, if (0, y) is in A and 
(z,q) is in G—A, then 


$$(z, q) \oplus (0, y) \oplus (z, q)^{-1} = (0, r'y),$$


where $r' = r^{-z}$ is a positive rational number $\neq 1$ by the definition of r and because
$z \neq 0$. Thus $r' = n/m$, where $m \neq n$ are positive integers, and hence


$$m[(z, q) \oplus (0, y) \oplus (z, q)^{-1}] = n(0, y).$$


Thus G and A satisfy the hypotheses of Theorem 2.2, and G is a S-ext group. 
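
The routine checks for this example can be run on a finite sample with exact rational arithmetic. The Python sketch below rests on assumptions that should be stated plainly: the operation is read as $(x, y) \oplus (z, q) = (x + z, r^z y + q)$, the value r = 2 is an arbitrary choice, and all function names are ours. Under this reading conjugation by (z, q) multiplies y by $r^{-z}$; under the mirror-image convention one would obtain $r^{z}$ instead, and in either case the factor is a positive rational different from 1.

```python
from fractions import Fraction
from itertools import product

R = Fraction(2)  # assumed value for the fixed positive rational r != 1


def op(g, h):
    """(x, y) (+) (z, q) = (x + z, r^z * y + q), the reading assumed here."""
    (x, y), (z, q) = g, h
    return (x + z, R ** z * y + q)


def inv(g):
    """Inverse of (z, q) under op, namely (-z, -r^(-z) * q)."""
    z, q = g
    return (-z, -(R ** (-z)) * q)


def conj(b, a):
    """b (+) a (+) b^{-1}."""
    return op(op(b, a), inv(b))


def positive(g):
    """Lexicographic cone on G: x > 0, or x = 0 and y >= 0."""
    x, y = g
    return x > 0 or (x == 0 and y >= 0)


E = (0, Fraction(0))
ys = [Fraction(-3, 2), Fraction(0), Fraction(1, 3), Fraction(2)]
sample = list(product(range(-2, 3), ys))

associative = all(op(op(f, g), h) == op(f, op(g, h))
                  for f in sample for g in sample for h in sample)
inverses_ok = all(op(g, inv(g)) == E and op(inv(g), g) == E for g in sample)
nonabelian = (op((1, Fraction(0)), (0, Fraction(1)))
              != op((0, Fraction(1)), (1, Fraction(0))))


def scaling_ok(b, a):
    """For a = (0, y) in A and b outside A: m * (b a b^{-1}) = n * a with m != n."""
    r1 = R ** (-b[0])                    # factor produced by conjugation
    m, n = r1.denominator, r1.numerator  # r1 = n/m in lowest terms
    c = conj(b, a)
    return c == (0, r1 * a[1]) and m != n and m * c[1] == n * a[1]


hypothesis_2 = all(scaling_ok(b, (0, y)) for b in sample if b[0] != 0 for y in ys)

# The cone of A is invariant under conjugation, as Proposition 1.5 requires.
cone_invariant = all(positive(conj(b, (0, y))) for b in sample for y in ys if y >= 0)
```

The last check is what allows G to be ordered lexicographically, as in example 4 of Section 1: because r is positive, conjugation never changes the sign of y.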
From the definitions it is immediate that any subgroup of a S-ext (resp. S*-ext, 
Sa-ext) group inherits this property. (This remark and that on direct products of 
S-ext groups, below, have somewhat wider application in view of the chart.) Such 
“subgroup theorems” are of interest in ordered groups. For example, it was a classical 
unsolved problem if every subgroup of an O*-group is an O*-group; a counterexample 
is in [13]. (Trivially, each subgroup of a fully ordered group inherits a full order, the
induced one, as noted in Section 1; i.e., any subgroup of an O-group is an O-group.)
Regarding "subgroup theorems," Terehov [23, p. 36] points out that a group is a
S-ext group if every finitely generated subgroup is a S-ext group while Kokorin [8] 
gives the same result for Sn-ext groups. (These results have wider application by the 
chart; also, like results hold for O-groups and O*-groups as immediate consequences
of Theorems 1.1 and 1.2 since the conditions listed there are local properties.)
Also important is whether a given class of ordered groups is closed under direct 
products. Kargapolov [6], and Kokorin [10] independently, show that the restricted 
direct product of O*-groups is an O*-group. (This generalizes a result of Kokorin [8] 
that the restricted direct product of S-ext groups is an O*-group, since any S-ext 
group is an O*-group in view of the chart to be discussed.) However, Kargapolov [6] 
shows that the class of O*-groups is not closed under the complete direct product, 
and I prove [20] that the direct product of S-ext groups need not be a S-ext group. 
The class of O-groups is easily seen to be closed under both restricted and complete 
direct products (under a well-ordered indexing set) by “lexicographic ordering” ; 
see example 3 of Section 1. 


In conclusion we prove, or refer to proofs of, the assertions in this chart for 
groups G, abelian or not (cf. Proposition 2.1): 


nilpotent + San-ext ⇔ torsion-free abelian ⇒
metabelian + S*-ext ⇔ metabelian + S-ext ⇔
metabelian + Sn*-ext ⇔ metabelian + Sn-ext ⇔
metabelian + Sa-ext ⇔ metabelian + San-ext ⇔
S-ext ⇔ S*-ext,

S*-ext ⇒ O* ⇒ O ⇒ torsion-free,
S-ext ⇒ Sa-ext ⇒ San-ext ⇒ O,
S-ext ⇒ Sn-ext ⇒ San-ext.

FIG. 1


‘‘Solvable’’ can replace ‘‘metabelian’’ without loss to the validity of the chart. 

Note that we have already established (or the definitions will easily establish)
most of the implications on the chart. Those not yet established are the fourth
and the sixth, ⇐ of the third, fifth, and seventh, and ⇒ of the first, eighth, and ninth.

The next theorem is like Theorem 2.2, but together the two yield important
Corollary 2.4 below.


THEOREM 2.3 [11, Theorem on p. 21]. A solvable group G is a S*-ext (resp.
S-ext, Sa-ext) group if and only if G contains a torsion-free abelian normal subgroup
A such that for arbitrary $g \notin A$ there are positive integers $m \neq n$ satisfying
$g^{-1} a^m g = a^n$ for all a in A. (Thus the labels S*-ext, S-ext and Sa-ext coincide
for solvable groups.)


COROLLARY 2.4. The labels S-ext and S*-ext coincide for all groups.


Proof. Theorems 2.2 and 2.3. 

Because of the importance of this last result (strangely unstated in the literature) 
we emphasize it: if a group G has the property that every full order for each subgroup 
of G extends to some full order for G, then every partial order for each subgroup of 
G extends to some full order for G. 


REMARK. Classes O and O* differ; all nonabelian free groups are examples of O- 
groups which are not O*, but the proofs are nontrivial — see, e.g., [3]. In fact, in 
[6] and [7] are examples of solvable O-groups which are not O*-groups. 


THEOREM 2.5. (from [5] and [23]). A solvable San-ext group is a S*-ext group. 


REMARK. Terehov [23] actually shows, on pp. 34-35, that a solvable S-ext group 
meets the conditions of Theorem 2.2, but his proof applies without essential change to 
solvable San-ext groups; now use Theorem 2.2 and Corollary 2.4. 

We now have more than enough to establish: 


COROLLARY 2.6. The labels S*-ext, S-ext, solvable S*-ext, solvable S-ext, solvable 
Sn*-ext, solvable Sn-ext, solvable Sa-ext, and solvable San-ext coincide. ‘Metabelian’ 
may replace ‘solvable.’ 


Every claim on the chart is now established except ⇒ of the first arrow. But this is
Corollary 1 of Lemma 1 of [8, p. 25]. Thus any nilpotent group from the 'new classes'
is abelian. (It is surprising that there are 2-step solvable, but not class 2 nilpotent,
groups among the groups introduced in the Russian work.) Thus, nonabelian
S*-ext groups, if finitely generated, are not locally nilpotent; for example, I show in
[20] that the group {a,b | ba = ab} is a S-ext group, and hence S*-ext by Corollary
2.4. This yields a stronger result than that of Livchak [14] that an O*-group need not
be locally nilpotent.

We have already noted that any solvable group from among the Russian groups
is metabelian. This situation and that described in the previous paragraph contrast
sharply with what holds for the "old classes". For example, I show [19] that
there are O*-groups of arbitrary solvable length as well as unsolvable O*-groups,
while Malcev [18] shows any torsion-free locally nilpotent group is an O*-group.

Finally, in a result relating the new and old groups Kopytov [12] shows that any 
S-ext group can be embedded in a divisible O*-group. 


This paper was completed by the author during his Faculty Fellowship at Butler University. 


References 


(Note: reference [2] has appeared in a revised German edition as Teilweise geordnete algebraische
Strukturen, Studia Mathematica, Band XIX, published by Vandenhoeck und Ruprecht, Göttingen,
1966. However, the footnote on p. 66, lines 5-13, is the only mention of the Russian work I have
discussed. Further, the German edition contains nothing new on the fundamentals of fully ordered
groups as discussed in section 1 of this paper. Thus we refer only to the more accessible English
edition.)


1. C. J. Everett, Note on a result of L. Fuchs on ordered groups, Amer. J. Math., 72 (1950) 216.

2. L. Fuchs, Partially Ordered Algebraic Systems, Pergamon, Oxford, 1963.

3. ------ and E. Sasiada, Note on orderable groups, Ann. Univ. Sci. Budapest. Eötvös Sect.
Math., 7 (1964) 13-17.

4. O. Hölder, Die Axiome der Quantität und die Lehre vom Mass, Ber. Verh. Sächs. Ges.
Wiss. Leipzig. Math.-Phys. Kl., 53 (1901) 1-64.

5. M. I. Kargapolov, Completely ordered groups (Russian), Algebra i Logika, (2) 1 (1962)
16-21. MR 27 (1964) #2569.

6. ------, Fully orderable groups. I (Russian), Algebra i Logika, (6) 2 (1963) 5-14. MR 30
(1965) #3156.

7. ------, A. I. Kokorin and V. M. Kopytov, On the theory of orderable groups (Russian),
Algebra i Logika, (6) 4 (1965) 21-27. MR 33 (1967) #4162.

8. A. I. Kokorin, On the theory of completely orderable groups (Russian), Ural. Gos. Univ.
Mat. Zap., (3) 4 (1963) 25-29. MR 32 (1966) #1271.

9. ------, On the theory of orderable groups (Russian), Algebra i Logika, (6) 2 (1963) 15-20.
MR 30 (1965) #3157.

10. ------, Ordering a direct product of ordered groups (Russian), Ural. Gos. Univ. Mat.
Zap., (3) 4 (1963) 95-96. MR 29 (1965) #5938.

11. ------ and V. M. Kopytov, Certain classes of ordered groups (Russian), Algebra i Logika,
(3) 1 (1962) 21-23. MR 27 (1964) #5840.

12. V. M. Kopytov, Completion of completely orderable groups (Russian), Ural. Gos. Univ.
Mat. Zap., (3) 4 (1963) 76-77. MR 32 (1966) #1272.

13. ------, On the theory of preorderable groups (Russian), Algebra i Logika, (6) 5 (1966)
27-31. MR 34 (1967) #4388.

14. Ja. B. Livchak, On orderable groups (Russian), Uchen. Zap. Ural. Gos. Univ., 23 (1959)
11-12. MR 29 (1965) #5935.

15. F. Loonstra, Ordered groups, Nederl. Akad. Wetensch., Proc., 49 (1946) 41-46.

16. P. Lorenzen, Über halbgeordnete Gruppen, Arch. Math., 2 (1949) 66-70.

17. J. Łoś, On the existence of linear order in a group, Bull. Acad. Polon. Sci. Cl. III, 2 (1954)
21-23.

18. A. I. Malcev, On the completion of group order (Russian), Trudy Mat. Inst. Steklov., 38
(1951) 173-175. MR 14 (1953), p. 13.

19. D. P. Minassian, On solvable O*-groups, Pacific J. Math., 39 (1971) 215-217.

20. ------, On the direct product of V-groups, Proc. Amer. Math. Soc., 30 (1971) 434-436.

21. M. Ohnishi, Linear-order on a group, Osaka Math. J., 4 (1952) 17-18.

22. ------, On linearization of ordered groups, Osaka Math. J., 2 (1950) 161-164.

23. A. A. Terehov, Completely orderable groups (Russian), Dokl. Akad. Nauk. SSSR, (1)
129 (1959) 34-36. MR 22 (1961) #734.

24. ------, The structure of locally solvable completely orderable groups (Russian), Algebra
i Logika, (2) 1 (1962) 10-15. MR 27 (1964) #2568.

25. A. A. Vinogradov, On the free product of ordered groups (Russian), Mat. Sb., 25 (1949)
163-168. MR 11 (1950), p. 157.


THE WILLIAM LOWELL PUTNAM 
MATHEMATICAL COMPETITION 


J. H. McKAY, Oakland University 


The following results of the thirty-second William Lowell Putnam Mathematical 
Competition held on December 4, 1971 have been determined in accordance with 
the regulations governing the Competition. This competition is supported by the 
William Lowell Putnam Intercollegiate Memorial Fund left by Mrs. Putnam in 
memory of her husband and is held under the auspices of the Mathematical Associa- 
tion of America. 

The first prize, five hundred dollars, is awarded to the Department of Mathematics 
of California Institute of Technology, Pasadena, California. The members of the 
team were Bruce Reznick, David Smith, and Michael Yoder; to each of these a 
prize of one hundred dollars is awarded. 

The second prize, four hundred dollars, is awarded to the Department of Mathe- 
matics of the University of Chicago, Chicago, Illinois. The members of the team 
were Robert Israel, David Saltman, and Robert Tax; to each of these a prize of 
seventy-five dollars is awarded. 

The third prize, three hundred dollars, is awarded to the Department of Mathe- 
matics of Harvard University, Cambridge, Massachusetts. The members of the 
team were Ira Gessel, David Harbater, and Jonathan Rosenberg; to each of these a 
prize of fifty dollars is awarded. 

The fourth prize, two hundred dollars, is awarded to the Department of Mathe- 
matics of the University of California at Davis, Davis, California. The members 
of the team were William Hamaker, Dean Hickerson, and Peter Loomis; to each of 
these a prize of fifty dollars is awarded. 

The fifth prize, one hundred dollars, is awarded to the Department of Mathe- 
matics of the Massachusetts Institute of Technology, Cambridge, Massachusetts. The 
members of the team were Richard Arratia, David Christie, and Don Coppersmith; 
to each of these a prize of fifty dollars is awarded. 

The six persons ranking highest in the examination, named in alphabetical order, 
are Don Coppersmith, Massachusetts Institute of Technology; Robert Israel, Univer- 
sity of Chicago; Dale Peterson, Yale University; Arthur Rubin, Purdue University; Da- 
vid Shucker, Swarthmore College; Michael Yoder, California Institute of Technology. 
Each of these has been designated as a Putnam Fellow by the Mathematical Associa- 
tion of America and is awarded a prize of two hundred and fifty dollars. 

The next four highest ranking individuals, named in alphabetical order, are
Gerald Myerson, Harvard University; Bruce Reznick, California Institute of Technology;
Jonathan Rosenberg, Harvard University; and Angelos Tsirimokos, Princeton
University. To each of these a prize of one hundred dollars is awarded.

The following teams, named in alphabetical order, won honorable mention: 


Case Western Reserve University, the members of the team were Walter Augenstein,
Steven Kalikow, and Michael Somos; University of Michigan, the members of
the team were Jonathan Glauser, Kenneth Rosen, and Dennis Stowe; Purdue University,
the members of the team were Paul Garrett, Michael O’Donnell, and Arthur
Rubin; University of Toronto, the members of the team were Robert Anderson,
Daniel Gautreau, and Daryl Geller; Yale University, the members of the team were
Dale Peterson, Eric Rosenthal, and Robert Weissler.

Honorable mention is given to the following thirty-two individuals, named in 
alphabetical order: Richard Arratia, Massachusetts Institute of Technology; Richard 
Bradley, Jr., Massachusetts Institute of Technology; Kenneth Brakke, University 
of Nebraska at Lincoln; Seth Breidbart, Harvard University; Wm. Randolph Franklin, 
University of Toronto; Paul Garrett, Purdue University; Daryl Geller, University 
of Toronto; John Gilbert, University of New Mexico; Daniel Grayson, University
of Chicago; Charles Grinstead, Pomona College; Paul Hagedorn, University of 
Virginia; David Harbater, Harvard University; Dean Hickerson, University of 
California at Davis; David Jerison, Harvard University; Paul Lemke, Rensselaer 
Polytechnic Institute; Peter Loomis, University of California at Davis; James Lyon, 
Princeton University; Steven McKay, Massachusetts Institute of Technology; Peter 
Olver, Brown University; James Paulson, Princeton University; Richard Poppen, 
Pomona College; Arthur Rothstein, Reed College; Thomas Russell, Princeton 
University; David Saltman, University of Chicago; Eric Schechter, University of 
Maryland; Paul Selick, University of Toronto; David Smith, California Institute of 
Technology; Michael Somos, Case Western Reserve University; Dennis Stowe, Uni- 
versity of Michigan; Robert Tax, University of Chicago; David Thornley, Univer- 
sity of Minnesota; Ray White, Princeton University. 

The other individuals who were ranked in the top one hundred, arranged by 
college, are: George Hardy and Daniel Kenway, University of Alberta; Charles 
Kaufman, Bates College; William Hamaker, University of California at Davis; Glenn 
Stevens, University of California at Santa Barbara; David Dummit, California 
Institute of Technology; Walter Augenstein and Steven Kalikow, Case Western 
Reserve University; Robert Hummel, Gary Miller, and David Vogan, University of 
Chicago; Joel Kleinman, City College of New York; David Levner and Robert
Wolpert, Cornell University; David Kreps, Dartmouth College; Marcy Barge, Fort 
Lewis College; Jeffrey Dielle, William Ganong, David Garlock, Orin Gensler, Ira 
Gessel, Harry Porta, and Karl Strom, Harvard University; Jerrold Tunnell, Harvey 
Mudd College; William Van Melle, University of Illinois; George Cornelius, Illinois
Institute of Technology; James Kuklinski, LaSalle College; Terry Andres, University
of Manitoba; Mark Leeper, University of Massachusetts, Amherst; Scott
Brown, David Christie, Joseph Mirzoeff, Frank Morgan, Edward Wimmers, 
Massachusetts Institute of Technology; Nozar Azarnia, Miami University; Jonathan 
Glauser and Kenneth Rosen, University of Michigan; John Reiser, Michigan State
University; Tavan Trent, University of North Carolina; Timothy Augustine and Steven
Garavaglia, University of Notre Dame; Craig Lee Huneke, Oberlin College; James
Lawrence, Oklahoma State University; Bradley Jackson, University of Oregon;
David Kallman, University of Pennsylvania; Eric Verheiden, Portland State Univer- 
sity; Richard Enison, Pratt Institute; Joseph Tupper, III, Princeton University; 
Michael O’Donnell, Purdue University; Peter Liepa, Queen’s University, Canada; 
James Alexander, Rice University; Jerome Eastham, Jr., Southwestern at Memphis; 
Robert Anderson and Peter deBuda, University of Toronto; Thomas Templeton,
University of Wisconsin; Eric Rosenthal and Robert Weissler, Yale University.¹

One thousand five hundred and sixty-nine students from three hundred and 
fourteen colleges and universities in the United States and Canada participated in 
the examination on December 4, 1971. 

The Questions Committee, consisting of Warren S. Loud (chairman), Murray
Klamkin, and Nathan S. Mendelsohn, prepared the problems (listed below) for the
competition.


1. Students at Tel Aviv University, which is ineligible as a university outside Canada and the 
United States, were permitted to write the examination under the supervision of Professor Harley 
Flanders and have their papers graded along with the others. One of these students, Ran Donagi, 
would have ranked eighth in the competition, and three others, Danny Berand, Joel Vodevoz, and
Amnon Dalcher, would have been listed in the top hundred.


PROBLEMS. PART A 


A-1. Let there be given nine lattice points (points with integral coordinates) in three dimen- 
sional Euclidean space. Show that there is a lattice point on the interior of one of the 
line segments joining two of these points. 


A-2. Determine all polynomials $P(x)$ such that $P(x^2 + 1) = (P(x))^2 + 1$ and $P(0) = 0$.


A-3. The three vertices of a triangle of sides a, b, and c are lattice points and lie on a circle 
of radius R. Show that $abc \ge 2R$. (Lattice points are points in the Euclidean plane with
integral coordinates.) 


A-4. Show that for $0 < \varepsilon < 1$ the expression $(x + y)^n (x^2 - (2 - \varepsilon)xy + y^2)$ is a polynomial
with positive coefficients for n sufficiently large and integral. For $\varepsilon = .002$ find the
smallest admissible value of n.


A-5. A game of solitaire is played as follows. After each play, according to the outcome, 
the player receives either a or b points (a and b are positive integers with a greater than
b), and his score accumulates from play to play. It has been noticed that there are thirty- 
five non-attainable scores and that one of these is 58. Find a and b. 


A-6. Let c be a real number such that $n^c$ is an integer for every positive integer n. Show that
c is a non-negative integer. 


1973] WILLIAM LOWELL PUTNAM MATHEMATICAL COMPETITION 173 
PART B 


B-1. Let S be a set and let ∘ be a binary operation on S satisfying the two laws
x ∘ x = x for all x in S, and
(x ∘ y) ∘ z = (y ∘ z) ∘ x for all x, y, z in S.


Show that ∘ is associative and commutative.


B-2. Let F(x) be a real valued function defined for all real x except for x = 0 and x = 1 and
satisfying the functional equation $F(x) + F\{(x - 1)/x\} = 1 + x$. Find all functions F(x)
satisfying these conditions. 


B-3. Two cars travel around a track at equal and constant speeds, each completing a lap 
every hour. From a common starting point, the first starts at time t = 0 and the second
at an arbitrary later time t = T > 0. Prove that there is a total period of exactly one
hour during the motion in which the first has completed twice as many laps as the 
second. 


B-4. A “spherical ellipse” with foci A, B on a given sphere is defined as the set of all points P
on the sphere such that $\overset{\frown}{PA} + \overset{\frown}{PB}$ = constant. Here $\overset{\frown}{PA}$ denotes the shortest distance
on the sphere between P and A. Determine the entire class of real spherical ellipses which
are circles.


B-5. Show that the graphs in the x-y plane of all solutions of the system of differential
equations


$x'' + y' + 6x = 0$, $y'' - x' + 6y = 0$ $(' = d/dt)$


which satisfy x’ (0) = y’ (0) = 0 are hypocycloids, and find the radius of the fixed 
circle and the two possible values of the radius of the rolling circle for each such solution. 
(A hypocycloid is the path described by a fixed point on the circumference of a circle 
which rolls on the inside of a given fixed circle.) 


B-6. Let $\delta(x)$ be the greatest odd divisor of the positive integer x. Show that
$\left| \sum_{n=1}^{x} \delta(n)/n - 2x/3 \right| < 1$, for all positive integers x.


SOLUTIONS. PART A 


The number in parentheses, immediately following the problem number, is the number of partici- 
pants who received a score of 8, 9 or 10 (10 is maximum possible) on the problem. In the case of A-1, 
A-2, B-1 and B-2, this applies to all 1569 participants. For the other problems, the count applies
only to the 1039 qualifiers.


A-1 (136). The set of all lattice points can be divided into eight classes according 
to the parities of the coordinates, namely, (odd, odd, odd), (odd, odd, even), etc. 
With nine lattice points some two, say P and Q, belong to the same class. The midpoint
of the segment PQ is then a lattice point, and it lies in the interior of that segment.




A-2 (176). $P(0) = 0$, $P(1) = [P(0)]^2 + 1 = 1$, $P(2) = [P(1)]^2 + 1 = 2$,
$P(5) = [P(2)]^2 + 1 = 5$, $P(5^2 + 1) = [P(5)]^2 + 1 = 26$, etc. Thus the polynomial P(x)
agrees with x for more values than the degree of P(x), so P(x) = x.


A-3 (18). For a triangle with sides a, b, c, area A and circumradius R we have
$abc = 4RA$. But if the vertices are lattice points the determinant formula (or Pick’s
Theorem or direct calculation) for the area shows that 2A is an integer. Hence
$2A \ge 1$, so that $abc \ge 2R$. To obtain the formula $abc = 4RA$ note that if $\alpha$ is the
angle opposite side a, then side a subtends an angle $2\alpha$ at the center and $a = 2R \sin\alpha$,
$A = \tfrac{1}{2} bc \sin\alpha$.
A-4 (49). In the expansion of $(x + y)^n (x^2 - (2 - \varepsilon)xy + y^2)$ the coefficient of
$x^{k+1} y^{n-k+1}$ is

$$\binom{n}{k-1} - (2 - \varepsilon)\binom{n}{k} + \binom{n}{k+1} = \binom{n}{k} \left\{ \frac{k}{n-k+1} + \frac{n-k}{k+1} - 2 + \varepsilon \right\}.$$

Now for fixed n consider the expression

$$\varphi(k) = \frac{k}{n-k+1} + \frac{n-k}{k+1} - 2 + \varepsilon.$$

If k is taken to be a continuous positive variable

$$\varphi'(k) = \frac{(n+1)\{(k+1)^2 - (n-k+1)^2\}}{(k+1)^2 (n-k+1)^2}.$$

Hence $\varphi'(k) = 0$ at k = n/2 and it follows easily that $\varphi(k)$ is minimum at k = n/2.
We needn’t consider end point minima since it easily follows that for n > 2 the
polynomial has its first two and last two coefficients positive. We may also note that
if the two mid-terms in the expansion are non-positive for a given odd value of n then
for the next larger value of n the mid-term remains non-positive. Hence if the
mid-coefficients become positive, the first value of n for which this occurs is odd. Now if
n is odd and $k = \tfrac{1}{2}(n + 1)$ then $\varphi(k) = \frac{n-1}{n+3} - 1 + \varepsilon$, and $\varphi(k) > 0$ for $n > \frac{4}{\varepsilon} - 3$.
If $\varepsilon = .002$, n > 1997 and n is odd. Hence the minimum n for which all terms are
positive is 1999.
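The value 1999 can be checked by brute force. The following Python sketch is not part of the published solution; the function names are hypothetical, and $\varepsilon$ is passed as a fraction p/q so that only integer arithmetic is needed. It tests the sign of each coefficient $\binom{n}{k}\varphi(k)$ after clearing the positive denominators of $\varphi(k)$.

```python
def all_coefficients_positive(n, p, q):
    """True if every coefficient of (x+y)^n (x^2 - (2 - p/q) x y + y^2) is positive.

    For 0 <= k <= n the coefficient of x^(k+1) y^(n-k+1) is C(n,k) * phi(k).
    phi(k) > 0 is tested after multiplying by q (n-k+1) (k+1), which is positive.
    The two extreme coefficients (of x^(n+2) and y^(n+2)) equal 1 and need no test.
    """
    return all(
        q * (k * (k + 1) + (n - k) * (n - k + 1))
        - (2 * q - p) * (n - k + 1) * (k + 1) > 0
        for k in range(n + 1)
    )


def smallest_admissible_n(p, q):
    """Smallest n >= 1 for which all coefficients are positive (epsilon = p/q)."""
    n = 1
    while not all_coefficients_positive(n, p, q):
        n += 1
    return n


if __name__ == "__main__":
    print(smallest_admissible_n(1, 500))  # epsilon = .002
```

Because the middle coefficient stays positive for every larger n, the first n returned by the search is also the threshold beyond which all n are admissible.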


A-5 (17). The attainable scores are those non-negative integers expressible in the
form xa + yb with x and y non-negative integers. If a and b are not relatively prime
there are infinitely many non-attainable scores. Hence (a, b) = 1. It will be shown
that the number of non-attainable scores is $\tfrac{1}{2}(a - 1)(b - 1)$.

If m is an attainable score, the line ax + by = m passes through at least one
lattice point in the closed first quadrant. Because a and b are relatively prime, the
lattice points on a line ax + by = m are at a horizontal distance of b. The first-
quadrant segment of ax + by = m has a horizontal projection of m/a and thus every
score $m \ge ab$ is attainable. Every non-attainable score must satisfy $0 \le m < ab$.

If $0 \le m < ab$, the first-quadrant segment of the line ax + by = m has a horizontal
projection less than b, and so contains at most one lattice point. Thus there is a
one-to-one correspondence between lattice points (x, y) with $0 \le ax + by < ab$ in the
first quadrant and attainable scores with $0 \le m < ab$. The closed rectangle $0 \le x \le b$,
$0 \le y \le a$ contains (a + 1)(b + 1) lattice points, so the number of lattice points in
the first quadrant with $0 \le ax + by < ab$ is $\tfrac{1}{2}(a + 1)(b + 1) - 1$. This is the number
of attainable scores with $0 \le m < ab$. Hence the number of non-attainable scores in
this range (which is all of them) is $ab - \tfrac{1}{2}(a + 1)(b + 1) + 1 = \tfrac{1}{2}(a - 1)(b - 1)$.

In our given example 70 = (a - 1)(b - 1) = 1(70) = 2(35) = 5(14) = 7(10). The
conditions a > b, (a, b) = 1 yield two possibilities: a = 71, b = 2 and a = 11, b = 8.
Since 58 = 71(0) + 2(29), the first of these alternatives is eliminated. The line 11x + 8y
= 58 passes through (6, -1) and (-2, 10) and thus does not pass through a lattice
point in the first quadrant. The unique solution is a = 11, b = 8.
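The counting argument can be confirmed by direct enumeration. The Python sketch below is an illustration only; the function names and the search bound are assumptions (the bound 72 is justified by (a - 1)(b - 1) = 70, which forces a ≤ 71). It lists the non-attainable scores for each coprime pair and keeps the pairs with exactly thirty-five such scores, one of which is 58.

```python
from math import gcd


def non_attainable(a, b):
    """Scores below a*b that cannot be written as x*a + y*b with x, y >= 0.

    For coprime a, b every score >= a*b is attainable, so this is the full list.
    """
    limit = a * b
    reachable = [False] * limit
    reachable[0] = True
    for m in range(1, limit):
        reachable[m] = (m >= a and reachable[m - a]) or (m >= b and reachable[m - b])
    return [m for m in range(limit) if not reachable[m]]


def solve(count=35, bad_score=58, bound=72):
    """All coprime pairs a > b (a < bound) with `count` non-attainable scores,
    one of which is `bad_score`."""
    answers = []
    for a in range(2, bound):
        for b in range(1, a):
            if gcd(a, b) != 1:
                continue
            missing = non_attainable(a, b)
            if len(missing) == count and bad_score in missing:
                answers.append((a, b))
    return answers


if __name__ == "__main__":
    print(solve())
```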


A-6 (0). The case n = 2 shows that c is non-negative. If the ordinary mean value
theorem is applied to $x^c$ on the interval [u, u + 1] there is a $\xi$ with $u < \xi < u + 1$ such
that $c\,\xi^{c-1} = (u + 1)^c - u^c$. For any positive integer u the right hand side is a positive
integer. Now, in the case 0 < c < 1, u could be taken large enough so $u^{c-1} < 1/c$ and
so $c\,\xi^{c-1} < 1$. Thus the mean value theorem for the first derivative eliminates all c
with 0 < c < 1.

There is an extension of the mean value theorem which states that if f(x) is k-times
differentiable in [a, b] then there is a $\xi$, $a < \xi < b$, such that $h^k f^{(k)}(\xi) = \Delta^k f(a)$, where
$h = (b - a)/k$ and $\Delta^k$ is the k-th difference for intervals spaced h apart. Take k as the
unique integer such that $k - 1 \le c < k$ and apply this extension of the mean value
theorem to $f(x) = x^c$ on the interval [u, u + k]. There is a $\xi$ with $u < \xi < u + k$ such that

$$c(c - 1)(c - 2) \cdots (c - k + 1)\,\xi^{c-k} = \Delta^k f(u).$$

The right hand side is an integer, and by taking u sufficiently large $\xi^{c-k}$ becomes
sufficiently small so that the left hand side, though non-negative, is less than 1. Hence
$c(c - 1)(c - 2) \cdots (c - k + 1) = 0$ and so c = k - 1.


SOLUTIONS. PART B 
B-1 (735). Using the given laws we have
x ∘ y = (x ∘ y) ∘ (x ∘ y) = [(x ∘ y) ∘ x] ∘ y = [(y ∘ x) ∘ x] ∘ y
= [(x ∘ x) ∘ y] ∘ y = (x ∘ y) ∘ y = (y ∘ y) ∘ x = y ∘ x.

From this commutative law we obtain
(x ∘ y) ∘ z = (y ∘ z) ∘ x = x ∘ (y ∘ z).
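As a sanity check on the statement (not a substitute for the proof), the following Python sketch enumerates every binary operation on a small finite set, keeps those obeying the two given laws, and confirms that each is commutative and associative. The function name and the encoding of an operation as a flat table are choices made here for illustration.

```python
from itertools import product


def check_all_operations(size=3):
    """Enumerate every binary operation on {0, ..., size-1} that obeys both laws
    and confirm each one is commutative and associative.

    Returns the number of operations that satisfy the two laws.
    """
    elems = range(size)
    satisfying = 0
    for table in product(elems, repeat=size * size):
        def op(x, y, table=table):
            return table[x * size + y]

        # Law 1: x o x = x.
        if any(op(x, x) != x for x in elems):
            continue
        # Law 2: (x o y) o z = (y o z) o x.
        if any(op(op(x, y), z) != op(op(y, z), x)
               for x in elems for y in elems for z in elems):
            continue
        satisfying += 1
        assert all(op(x, y) == op(y, x) for x in elems for y in elems)
        assert all(op(op(x, y), z) == op(x, op(y, z))
                   for x in elems for y in elems for z in elems)
    return satisfying


if __name__ == "__main__":
    print(check_all_operations(3))
```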


B-2 (314). In the given functional equation

(1) $F(x) + F\left(\frac{x-1}{x}\right) = 1 + x$

we substitute $\frac{x-1}{x}$ for x, obtaining

(2) $F\left(\frac{x-1}{x}\right) + F\left(\frac{-1}{x-1}\right) = \frac{2x-1}{x}$.

Also in (1), we substitute $\frac{-1}{x-1}$ for x and obtain

(3) $F\left(\frac{-1}{x-1}\right) + F(x) = \frac{x-2}{x-1}$.

Adding (1) and (3) and subtracting (2) gives

$2F(x) = 1 + x + \frac{x-2}{x-1} - \frac{2x-1}{x}$,

(4) $F(x) = \frac{x^3 - x^2 - 1}{2x(x-1)}$.

That F(x), defined in (4), does satisfy the given functional equation is easily verified.
Therefore (4) is the only solution of the problem.
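The verification left to the reader can be carried out exactly with rational arithmetic. The Python sketch below is illustrative only (the helper name `satisfies_equation` is hypothetical); it evaluates the candidate (4) at rational points and checks the functional equation without rounding error.

```python
from fractions import Fraction


def F(x):
    """Candidate solution (4): F(x) = (x^3 - x^2 - 1) / (2 x (x - 1))."""
    return (x**3 - x**2 - 1) / (2 * x * (x - 1))


def satisfies_equation(x):
    """Check F(x) + F((x - 1)/x) == 1 + x exactly, for x not in {0, 1}."""
    return F(x) + F((x - 1) / x) == 1 + x


if __name__ == "__main__":
    points = [Fraction(n, 7) for n in range(-30, 31) if n not in (0, 7)]
    print(all(satisfies_equation(x) for x in points))
```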


B-3 (155). At time t, car 1 has completed [t] laps and car 2 has completed
[t - T] laps. The problem is to find values of $t \ge T$ for which [t] = 2[t - T].

Let $T = k + \delta$, where $0 \le \delta < 1$, k an integer. Consider any integral interval
[m, m + 1] and let $m \le t < m + 1$. Then $t = m + \varepsilon$, where $0 \le \varepsilon < 1$. Then the
equation to be solved becomes

$[t] = m = 2[t - T] = 2[m + \varepsilon - (k + \delta)] = 2[m - k + \varepsilon - \delta]$.

Thus m = 2(m - k), if $\varepsilon \ge \delta$, and m = 2(m - k - 1), if $\varepsilon < \delta$. If $1 > \varepsilon \ge \delta$, then
m = 2k and the equation is satisfied during $[2k + \delta, 2k + 1]$, which has length
$1 - \delta$.

If $0 \le \varepsilon < \delta$, then m = 2k + 2 and the equation is satisfied during $[2k + 2,
2k + 2 + \delta]$, which has length $\delta$. Therefore the total length is $1 - \delta + \delta = 1$.
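The total of one hour can be checked numerically on a grid. The Python sketch below is an illustration under stated assumptions: time is measured in integer units (by default thousandths of an hour), T is a whole number of such units, and the function name is hypothetical. It counts the grid instants t ≥ T with [t] = 2[t - T].

```python
def matching_time(T_units, units_per_hour=1000):
    """Total time in hours, after the second car starts, during which
    floor(t) == 2 * floor(t - T), sampled on a grid of 1/units_per_hour hours.

    T_units is the delay T expressed in grid units. The search stops at
    2k + 4 hours (k = integer part of T), past the last interval 2k + 2 + delta.
    """
    horizon_hours = 2 * (T_units // units_per_hour) + 4
    count = 0
    for t in range(T_units, horizon_hours * units_per_hour):
        laps_first = t // units_per_hour
        laps_second = (t - T_units) // units_per_hour
        if laps_first == 2 * laps_second:
            count += 1
    return count / units_per_hour


if __name__ == "__main__":
    print(matching_time(2347))  # T = 2.347 hours
```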


Comment: The problem should have been more explicit by stating “after the
start of the second car” instead of “during the motion”. The solution is given for
this interpretation, whereas, if t < T, [t - T] is negative but the second car would
have completed zero laps.


B-4 (10). We take the radius of the sphere as unity and denote the constant sum
PA + PB by 2a. To avoid trivial and degenerate cases we assume that $0 < AB < \pi$
and that $AB < 2a < 2\pi - AB$.

The case $2a > \pi$ can be reduced to the case $2a < \pi$. For, if A′ and B′ are the
points diametrically opposite to A and B then PA + PB = 2a if and only if
$PA' + PB' = 2\pi - 2a$; that is, the spherical ellipses PA + PB = 2a and
$PA' + PB' = 2\pi - 2a$ are identical. Since $\min(2a, 2\pi - 2a) \le \pi$, we may assume without
loss of generality that $2a \le \pi$.

Let A and B lie on the equator. There are two points $V_1$ and $V_2$ (the “vertices”) on
the equator which lie on the spherical ellipse. Obviously, $V_1V_2 = 2a$. The “center”
of the spherical ellipse (common midpoint of the arcs AB and $V_1V_2$) will be denoted
by C.


[Fig. 1 and Fig. 2 are not reproduced.]


We first treat the case $2a < \pi$ and show that in this case the spherical ellipse
cannot be a circle. Assume it were a circle; call it $\Gamma$ (see Figure 1). $\Gamma$ would have to
be symmetric with respect to the equatorial plane, thus lie in a plane perpendicular
to the equatorial plane. $\Gamma$ would also have to pass through the vertices. Therefore its
spherical diameter would be $V_1V_2 = 2a$ and its spherical radius would be equal
to a. The spherical center of $\Gamma$ would be C, the center of the ellipse. Let M be one of
the two points on $\Gamma$ which lie half-way between the two vertices. Then, since M is
supposed to be a point on the spherical ellipse, 2a = MA + MB > 2MC = 2a (note
that MAC is a right spherical triangle with the right angle at C and with side
$MC = a < \tfrac{1}{2}\pi$). This contradiction shows that the only possible spherical ellipses which
are circles must occur when $2a = \pi$.




In case $2a = \pi$, $V_1$ and $V_2$ are diametrically opposite points on the equator. We
shall show that the great circle $\Gamma$ through the vertices and perpendicular to the
equatorial plane is identical with the spherical ellipse $PA + PB = \pi$. To see this, let
B* be the reflection of B about the plane of $\Gamma$. B* is on the equator diametrically
opposite to A (see Fig. 2). Let P be an arbitrary point on the sphere, and draw the
great circle through A, P and B*. Then $PA + PB^* = \pi$. Hence, $PA + PB = \pi$ if
and only if PB = PB*, that is, if and only if P is on $\Gamma$. This shows that $\Gamma$ is the
spherical ellipse $PA + PB = \pi$, as stated above.

Thus the only circles on the sphere that are spherical ellipses are the great circles.
For any given great circle $\Gamma$ the foci can be any two points A and B which lie on the
same great circle perpendicular to $\Gamma$, on the same side of $\Gamma$ and at equal distances
from $\Gamma$. The equation of any such spherical ellipse is $PA + PB = \pi$.


B-5 (7). We put z = x + iy. Then both differential equations can be combined
into one, namely

(1) $z'' - iz' + 6z = 0$.

This is a standard linear equation of the second order with constant coefficients and
has the general solution

$z(t) = c_1 e^{3it} + c_2 e^{-2it}$.

The initial conditions imply z′(0) = 0 or $3ic_1 - 2ic_2 = 0$. We may set $c_1 = 2A$ and
$c_2 = 3A$, where A is any complex number. The general solution of the given system is

(2) $z(t) = 2Ae^{3it} + 3Ae^{-2it}$.

If $A = Re^{i\alpha}$ then a rotation of axes through the angle $\alpha$ produces

(3) $Z(t) = 2Re^{3it} + 3Re^{-2it}$

or in rectangular form

$X(t) = 2R\cos(3t) + 3R\cos(2t)$, $Y(t) = 2R\sin(3t) - 3R\sin(2t)$.

This is the standard form for a hypocycloid when the radius of the rolling circle is 3R
and the fixed circle is of radius 5R. On time reversal it becomes the standard equations
of a hypocycloid with radius of the rolling circle of 2R and the radius of the fixed
circle of 5R.


B-6 (52). Set $S(x) = \sum_{n=1}^{x} \delta(n)/n$.


1973] MATHEMATICAL NOTES 179 


Note that $\delta(2m + 1) = 2m + 1$, $\delta(2m) = \delta(m)$ and that, with $S(x) = \sum_{n=1}^{x} \delta(n)/n$,
$S(2x + 1) = S(2x) + 1$.
Dividing the summation for S(2x) into even and odd values of the index produces
the following relation:

$$S(2x) = \sum_{m=1}^{x} \frac{\delta(2m)}{2m} + \sum_{m=1}^{x} \frac{\delta(2m - 1)}{2m - 1} = \tfrac{1}{2} S(x) + x.$$

If we denote $S(x) - \frac{2x}{3}$ by F(x), the above relations translate into

$F(2x) = \tfrac{1}{2} F(x)$, and $F(2x + 1) = F(2x) + \tfrac{1}{3}$.

Now induction can be used to show that $0 < F(x) < \tfrac{2}{3}$, for all positive integers x.
This result is sharper than that requested.
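The sharper bound can be confirmed for small x by exact computation. The Python sketch below is illustrative only (function names are hypothetical); it accumulates $S(x) = \sum_{n \le x} \delta(n)/n$ as an exact fraction and tests $0 < S(x) - 2x/3 < 2/3$.

```python
from fractions import Fraction


def greatest_odd_divisor(n):
    """delta(n): strip all factors of 2 from the positive integer n."""
    while n % 2 == 0:
        n //= 2
    return n


def check_bounds(limit):
    """Verify 0 < S(x) - 2x/3 < 2/3 for every x with 1 <= x <= limit."""
    s = Fraction(0)
    for x in range(1, limit + 1):
        s += Fraction(greatest_odd_divisor(x), x)
        f = s - Fraction(2 * x, 3)
        if not (0 < f < Fraction(2, 3)):
            return False
    return True


if __name__ == "__main__":
    print(check_bounds(2000))
```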


Acknowledgements 


The Director would like to acknowledge the assistance of the Questions Committee and the 
graders, especially Fritz Herzog, in preparing the above solutions and acknowledge the services of 
the following persons who were graders for the competition: J. C. Chipman, C. V. Coffman, S. E. 
Crick, Jr., R. A. DeVore, R. M. Dudley, W. R. Emerson, D. J. Eustice, J. Froemke, R. A. Gambill, 
L. J. Green, R. C. Hamelink, M. Hausner, F. Herzog, L. M. Kelly, B. B. Lieberman, W. S. Loud, 
D. A. Malm, E. A. Nordhaus, R. Pollack, I. Schochetman, M. E. Shanks, J. P. Williams, E. T.
Wong.


MATHEMATICAL NOTES 


EDITED BY ROBERT GILMER 




The present backlog for this Department is substantial. Until further notice, new manuscripts 
cannot be accepted. This moratorium will probably continue until June 1, 1973; authors are
requested to hold their manuscripts pending a further announcement. 


THE AREA OF A HYPERSPHERE IN RIEMANNIAN SPACE 
B. A. Fusaro, Queens College, N. C. & National Taiwan Normal University 


1. Introduction. In Euclidean m-space the surface area $\Omega_m$ of a (hyper)sphere
with radius t is given by [1, p. 303]

(1) $\Omega_m = \omega_m t^{m-1}$, $\omega_m = 2\pi^{m/2}/\Gamma(m/2)$,

where $\omega_m$ denotes the area of a unit sphere in $E^m$. For example, from the value
$\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$ we get the familiar $\omega_2 = 2\pi$ and $\omega_3 = 4\pi$.
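Formula (1) is easy to evaluate numerically. The short Python sketch below uses the standard library function `math.gamma`; the function names are chosen here for illustration.

```python
import math


def unit_sphere_area(m):
    """omega_m = 2 * pi^(m/2) / Gamma(m/2), the area of the unit sphere in E^m."""
    return 2 * math.pi ** (m / 2) / math.gamma(m / 2)


def sphere_area(m, t):
    """Omega_m = omega_m * t^(m-1), the Euclidean area of a sphere of radius t."""
    return unit_sphere_area(m) * t ** (m - 1)


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, unit_sphere_area(m))
```

For m = 2 and m = 3 the values agree with $2\pi$ and $4\pi$ up to floating-point rounding.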
We shall consider a Riemannian m-space and ask: Is the area $\Omega_m$ of a sphere


180 B. A. FUSARO (February 


of fixed radius independent of the location of the sphere in the space? Just as a
physicist employs the concept of a test electric charge, or magnetic pole, to probe
physical space, so a mathematician can arm himself with a test sphere to examine
geometrical space. More precisely, we associate with each point P of m-space a
closed spherical neighborhood of radius t and (in order to avoid self-intersecting
or “incomplete” neighborhoods) consider only those that are homeomorphic to a
closed m-sphere in $E^m$. We then ask: Is $\Omega_m = \Omega_m(P, t)$ independent of P for every
fixed t?

In $E^m$ the answer is evidently yes. Another space that is of interest is the space
$K^m$ of constant curvature c. This space, which is isometric to $E^m$ when c = 0, has
a somewhat involved definition via a curvature tensor. However, for c > 0 we can
interpret $K^m$ as the surface of an (m + 1)-sphere of radius $1/\sqrt{c}$ in $E^{m+1}$ (see [3]).
It follows from the geometry that the area of a sphere is independent of its location
in a space of positive constant curvature. In fact, the independence property holds
for all c with

(2) $\Omega_m = \omega_m [\sin(t\sqrt{c})/\sqrt{c}]^{m-1}$ for $0 < c < (\pi/t)^2$, and $\Omega_m = \omega_m [\sinh(t\sqrt{-c})/\sqrt{-c}]^{m-1}$ for $c < 0$.

Fulton [3] has recently given a simplified derivation of (2).

The Harmonic spaces $H^m$ of Copson and Ruse (1939) enter the scene quite
naturally as the next superspace. These spaces specialize to $K^m$ for m = 2, 3. [See 5]

It will be shown in this note that it follows readily from known results in $H^m$
theory, the use of Riemannian normal coordinates, and a simplified definition (3)
of $\Omega_m$ that spheres in $H^m$ do have this independence property. The converse property
is discussed in the last section.


2. Riemannian space and Riemannian normal coordinate systems (RNCS). We 
begin with an m-dimensional differentiable manifold of class C”. Briefly this is a 
Hausdorff ($T_2$), locally Euclidean, connected space that, via suitable homeomorphic
maps into $E^m$, allows at each point the erection of a coordinate system (CS)
and C” transformations to other CS [4 or 6]. We introduce on the manifold a CS 
and convert it to a metric space via the symmetric, positive-definite, $C^2$ quadratic
form

$ds^2 = g_{ij}(x)\,dx^i dx^j$, $x = (x^1, x^2, \cdots, x^m)$.

The convention of summing from 1 to m on repeated indices is used throughout
this note. The CS and metric will be denoted by

$(x; \sqrt{g})$; $g = \det(g_{ij})$.

A transformation from $(x; \sqrt{g})$ to $(y; \sqrt{h})$ is given by

$y = f(x)$, $h_{ij} = (\partial x^p/\partial y^i)(\partial x^q/\partial y^j)\,g_{pq}$.


[Figure not reproduced. Panel captions: positive curvature $1/a^2$, $\Omega_2 = 2\pi a \sin(t/a)$; negative curvature $-1/a^2$, $\Omega_2 = 2\pi a \sinh(t/a)$; mixed curvature.]

The metric is given by $r = |Q - P| = \min \int (ds/dt)\,dt$, where the minimum is
taken over all continuously differentiable parametrized arcs x = x(t) connecting P
to a neighboring point Q. A minimizing arc is a geodesic. A (geodesic hyper)sphere
in this m-space can then be defined as in Euclidean space.

A coordinate system can be chosen so that $ds^2 = \delta_{ij}\,dx^i dx^j$ if and only if the
space is $E^m$. However, it is always possible on a class C° differentiable manifold to
choose a system $(y; \sqrt{h})$ so that $h_{ij} = \delta_{ij}$ at the origin and so that equations of
geodesics issuing from the origin have the linear form $y^i = \beta^i r$, with $|\beta| = 1$,
and where r denotes geodesic distance measured from the origin. This system is
known as a Riemannian normal coordinate system, abbreviated RNCS. A property
of a RNCS is that $h_{ij} y^j = \delta_{ij} y^j$ along a geodesic so that the square of the distance
from the origin measured along a geodesic takes the form $r^2 = \delta_{ij} y^i y^j$. See [4, pp.
149, 307] for a full discussion. These special coordinate systems are very useful
because they put many of the results of Riemannian geometry in familiar Euclidean
form.


3. The surface $\Omega_m$ of a sphere in Riemannian space. Consider, in a Riemannian
m-space, a sphere with center P and radius t, and let Q denote a variable point of
the sphere. In the system $(x; \sqrt{g})$ the volume of this sphere is

$$\int^{(m)}_{|x - \xi| \le t} \sqrt{g}\,dx, \qquad P = P(\xi),\ Q = Q(x).$$

The area of its surface will be defined by the equation

(3) $\Omega_m = \Omega_m(P, t) = \frac{d}{dt} \int^{(m)}_{|x - \xi| \le t} \sqrt{g}\,dx$.

If the space is Euclidean then (3) reduces to (1). This definition by-passes the
difficulties attending the parametrization of a surface.

We shall assume that $(x; \sqrt{g})$ is referred to a RNCS with origin at P and choose
$\xi = 0$. The indicated differentiation in (3) can be explicitly carried out after a
transformation to a geodesic polar system $(r, \theta; \sqrt{\gamma})$. If we interpret our RNCS as
rectangular coordinates, this transformation takes the form of the usual one for polar
coordinates [2, p. 65]. The typical case m = 4, written in subscript notation, is

$x_1 = r\cos\theta_1$
$x_2 = r\sin\theta_1\cos\theta_2$
$x_3 = r\sin\theta_1\sin\theta_2\cos\theta_3$
$x_4 = r\sin\theta_1\sin\theta_2\sin\theta_3$

with $\theta_i$ in $[0, \pi]$ for i = 1, 2 and $\theta_3$ in $[0, 2\pi)$. Here r denotes the Euclidean distance
from the origin P and also represents geodesic distance |Q - P| in the original
RNCS. The angles $\theta_i$ are measured at the origin and have their usual Euclidean
meaning.

The Jacobian J of the above transformation has the form $J = r^{m-1}\Phi(\theta)$, and
satisfies the relation $\sqrt{\gamma} = |J|\sqrt{g}$. The volume integral over the sphere $|x| \le t$
can now be written as

$$\int^{(m)} \sqrt{g}\,dx = \int^{(m)} \sqrt{\gamma}\,dr\,d\theta = \int^{(m)} \sqrt{g}\,|J|\,dr\,d\theta = \int_0^t r^{m-1}\,dr \int^{(m-1)} \sqrt{g}\,\Phi(\theta)\,d\theta.$$

A Harmonic space $H^m$ can be characterized by the property that g is radially
symmetric [5, p. 35]. That is, the coordinate variable x and the parameter $\xi$ enter
$g = \det(g_{ij})$ only via r = |P - Q|, so that one g-function serves for the whole
space. From definition (3) we have


$$\Omega_m = \frac{d}{dt}\left\{ \int_0^t \sqrt{g}\,r^{m-1}\,dr \right\} \int^{(m-1)} \Phi(\theta)\,d\theta = t^{m-1}\sqrt{g(t)} \int^{(m-1)} \Phi(\theta)\,d\theta.$$

The angle integral in this expression for $\Omega_m$ is the ordinary $E^m$ unit surface element
$\omega_m$ in (1), as is seen by letting $H^m = E^m$ and choosing t = 1. Therefore,

(4) $\Omega_m = \sqrt{g(t)}\,t^{m-1}\omega_m$ in $H^m$ (referred to a RNCS),

so that in a Harmonic space the area of a sphere of given radius is independent
of its location in the space. If the space is of constant curvature, then

$\sqrt{g(t)} = [\sin(t\sqrt{c})/(t\sqrt{c})]^{m-1}$

and the above expression for $\Omega_m$ reduces to (2), as it should [5, p. 30].
Can the above argument be reversed to show that if a test sphere has the
independence property then g is radially symmetric so that the space is Harmonic:

$\Omega_m(P, t) = \Omega_m(t) \Rightarrow g(r, \theta) = g(r)$?

No, because it can happen that the angle $\theta$ enters g at each point P in such a way
as to be integrated away.


4. The mean-value as an even function of the radius. Let $M = M(t, \xi; f)$ denote
the mean-value of a continuous function f averaged over a sphere with radius t
and center $P(\xi)$:

$\Omega_m \cdot M = \int^{(m-1)} f(x)\,dS_x$, $|x - \xi| = t$.

After a transformation to a RNCS with origin at P, this expression takes the form

$\Omega_m \cdot M = \int^{(m-1)} f(\xi + \alpha t)\,\sqrt{g}\,t^{m-1}\,d\alpha$, $x = \xi + \alpha t$, $|\alpha| = 1$,

which is defined for negative t. Now assume the space is Harmonic and apply
equation (4) to get

$\omega_m \cdot M = \int^{(m-1)} f(\xi + \alpha t)\,d\alpha$, $|\alpha| = 1$.

The usual Euclidean argument then yields that M is an even function of t.


5. A tempting conjecture. If a test sphere indicates that a space has the
independence property, is that space Harmonic? The tempting affirmative conjecture,
if correct, yields a simple geometric characterization of $H^m$. This converse question
does not appear to be an easy one to answer. It is worth knowing whether the
answer is yes even for a space $K^m$ of constant curvature, especially because $H^m = K^m$
for the cases m = 2, 3, as was remarked earlier. For the case m = 2, at least, the
answer is yes. If $\Omega_2$ is independent of the center P of the sphere, it follows from


184 J. L. ULLMAN [February 


the Puiseux-Bertrand formula [4, p. 151] for the curvature c(P),

$\pi \cdot c(P) = 3 \lim_{t \to 0} (2\pi t - \Omega_2)/t^3$,

that the space is of constant curvature.


Supported in part by NSF grant GP 1834. 


References 


1. R. Courant, Differential and Integral Calculus, vol. II, Interscience, New York, 1964. 

2. L.E. Blumenson, A derivation of n-dimensional spherical coordinates, this MONTHLY, 67 
(1960) 63-66. 

3. C. M. Fulton, Hyperspheres in spaces of constant curvature, this MONTHLY, 76 (1969) 43-44. 

4. E. Kreyszig, Differential Geometry and Riemannian Geometry, University Press, Toronto, 
1968. 

5. H.S. Ruse, A. G. Walker, T. J. Willmore, Harmonic Spaces, Edizioni Cremonese, Rome, 1961. 

6. T. J. Willmore, An Introduction to Differential Geometry, Oxford University Press, London, 
1959. 


AN AREA THEOREM FOR SCHLICHT FUNCTIONS 
J. L. ULLMAN, University of Michigan 


The theorem proved in this paper was observed by Prof. H. Alexander, of the 
University of Michigan, to be a consequence of theorems concerning functions 
of several complex variables found in Rutishauser [2, p. 257, p. 259]. The content 
of the theorem states a simple and, we believe, interesting property concerning 
univalent conformal maps, and we feel that a proof based on tools of one-variable
complex analysis will be of interest. 


THEOREM. Let w = f(z) be analytic in the domain $\Sigma_z = \{z : |z| < 1\}$, univalent,
and let f(0) = 0. Let $\Sigma_w = \{w : |w| < 1\}$, let $S = f(\Sigma_z) \cap \Sigma_w$, and let $A = f^{-1}(S)$.
It is then true that (a)

(1) Area (A) + Area (S) $\ge \pi$

and (b) that no larger constant can be used on the right side of (1).


Proof. An example shows that (b) holds. Namely, let $w = \lambda z$, $|\lambda| < 1$. Then
we have $S = \{w : |w| < |\lambda|\}$, $A = \Sigma_z$, and Area (A) + Area (S) = $(1 + |\lambda|^2)\pi$.
Thus (b) is established since $|\lambda|$ can be made arbitrarily small.

The proof of (a) is divided into two cases. In Case I, we assume that f(z) is
analytic and univalent on the set $\bar{\Sigma}_z = \{z : |z| \le 1\}$ and in Case II, the general
case is considered.




Case I. If $\sigma_z = \{z : |z| = 1\}$, then $f(\sigma_z)$ is an analytic Jordan curve, and either
intersects $\sigma_w = \{w : |w| = 1\}$ a finite number of times or coincides with $\sigma_w$. In the
latter case, $f(z) = \alpha z$, $|\alpha| = 1$, and the theorem is true, so we consider the first
situation. The set S need not be a connected set, but $S_0$, the component of S containing
w = 0, is simply connected and bounded by a simple piecewise analytic curve,
$\Gamma(S_0)$. We only use the fact that $\Gamma(S_0)$ is a Jordan curve. Thus $A_0 = f^{-1}(S_0)$ is
a simply connected subset of $\Sigma_z$ containing z = 0, and is bounded by a Jordan curve
$\Gamma(A_0)$. Since $S_0 \subset S$ and $A_0 \subset A$, it is sufficient to prove

(2) Area ($A_0$) + Area ($S_0$) $\ge \pi$.


Because of the stated properties of $A_0$, we know by the Riemann mapping theorem
and Carathéodory’s theorem on the boundary behavior of the mapping function
(Hille [1, p. 320, p. 360]), that there is a function h(t) analytic in $\Sigma_t = \{t : |t| < 1\}$
and continuous in $\bar{\Sigma}_t = \{t : |t| \le 1\}$ such that $\Sigma_t$ is mapped univalently onto $A_0$
and $\sigma_t = \{t : |t| = 1\}$ is mapped univalently onto $\Gamma(A_0)$. Furthermore, the function
g(t) = f(h(t)) is analytic in $\Sigma_t$ and continuous in $\bar{\Sigma}_t$ and maps $\Sigma_t$ univalently onto
$S_0$ and $\sigma_t$ univalently onto $\Gamma(S_0)$. Thus if $h(t) = \sum_{n=1}^{\infty} a_n t^n$ and $g(t) = \sum_{n=1}^{\infty} b_n t^n$,
we have (Hille [1, p. 360])

(3) Area ($A_0$) + Area ($S_0$) = $\pi \sum_{n=1}^{\infty} n(|a_n|^2 + |b_n|^2)$.

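Since h(t) and g(t) are continuous on $\sigma_t$, we also have the relations (Hille
[1, p. 360])

(4) $\frac{1}{2\pi} \int_0^{2\pi} |h(e^{i\theta})|^2\,d\theta = \sum_{n=1}^{\infty} |a_n|^2$, $\frac{1}{2\pi} \int_0^{2\pi} |g(e^{i\theta})|^2\,d\theta = \sum_{n=1}^{\infty} |b_n|^2$.

If $e_1 = \{e^{i\theta} : |h(e^{i\theta})| = 1\}$, $e_2 = \{e^{i\theta} : |g(e^{i\theta})| = 1\}$, we find that

(5) $\frac{1}{2\pi} \int_0^{2\pi} |h(e^{i\theta})|^2\,d\theta \ge \frac{\mathrm{meas}(e_1)}{2\pi}$, $\frac{1}{2\pi} \int_0^{2\pi} |g(e^{i\theta})|^2\,d\theta \ge \frac{\mathrm{meas}(e_2)}{2\pi}$,

where meas ($e_j$) indicates the linear Lebesgue measure of

$\{\theta : e^{i\theta} \in e_j,\ 0 \le \theta \le 2\pi\}$, j = 1, 2.

We use the fact in (5) that $|h(e^{i\theta})| \ge 0$ and $|g(e^{i\theta})| \ge 0$. Once we show

(6) meas ($e_1$) + meas ($e_2$) $\ge 2\pi$,

the combination of (3), (4), (5), and (6) yields (2), so we proceed to the proof of (6).
If meas ($e_1$) = $2\pi$, (6) follows, so we consider the case that this is not so. There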


186 O. T. ALAS [February 


will then be a point on $\sigma_t$ not in $e_1$. If we can show that such a point must be in $e_2$,
(6) will hold. Assume then that $e^{i\theta_0} \notin e_1$, so that $|h(e^{i\theta_0})| < 1$. Now $g(e^{i\theta_0})$ is a point
of $\Gamma(S_0)$. The set $\Gamma(S_0)$ consists of arcs of $\sigma_w$ and arcs of $f(\sigma_z)$. Assume next that
$e^{i\theta_0} \notin e_2$. This means that $|g(e^{i\theta_0})| < 1$. Thus $g(e^{i\theta_0})$ is a point of $f(\sigma_z)$. On the other
hand, $g(e^{i\theta_0}) = f(h(e^{i\theta_0}))$, and since $|h(e^{i\theta_0})| < 1$, $f(h(e^{i\theta_0}))$ is an interior point of
$f(\Sigma_z)$ and hence cannot be a point of $f(\sigma_z)$. This contradiction completes the proof
for Case I.


Case II. If w = f(z) is analytic in the domain $\Sigma_z$, univalent and satisfies f(0) = 0,
then $f_r(z) = f(rz)$ satisfies the requirements of Case I when 0 < r < 1. Thus if
$S_r = f_r(\Sigma_z) \cap \Sigma_w$ and $A_r = f_r^{-1}(S_r)$, we have by (1)

(7) Area ($A_r$) + Area ($S_r$) $\ge \pi$.

It remains to show

(a) $\lim_{r \uparrow 1}$ Area ($S_r$) = Area (S) and (b) $\lim_{r \uparrow 1}$ Area ($A_r$) = Area (A)

and the proof is complete. Now $S_r$ is an open set, increasing as $r \uparrow 1$, and exhausts S,
thus establishing (a). In addition

$f_r^{-1}(S_r) = \frac{1}{r} f^{-1}(S_r)$,

and so (b) is established by letting r tend to one, and the proof of Case II is complete.
The methods of this paper have been enlarged upon, and have led to the following
theorem which will appear in [3].


THEOREM. If f(z) is continuous in $\bar{\Sigma}_z$, analytic in $\Sigma_z$ and satisfies f(0) = 0,
then $\tfrac{1}{2} \int_0^{2\pi} |f(e^{i\theta})|^2\,d\theta \le A(D)$, where $D = f(\Sigma_z)$ and multiplicity is not counted
in measuring area.


References 


1. E. Hille, Analytic Function Theory, Vol. II, Ginn and Company, New York, 1962.

2. H. Rutishauser, Über Folgen und Scharen von analytischen und meromorphen Functionen
mehrerer Variabeln, sowie von analytischen Abbildungen, Acta Mathematica, 83 (1950) 249-325.

3. H. Alexander, B. A. Taylor and J. L. Ullman, Areas of Projections of Analytic Sets,
Inventiones Mathematicae, Fasc. 4., 16 (1972) 335-341.


ON SET POINTS OF DISCONTINUITY 
O. T. ALAS, University of São Paulo
In this note we shall prove a theorem which is a generalization of a well-known 
theorem on metric spaces [2]. 


THEOREM. Let X be a topological space, $(Y, \mathcal{U})$ be a Hausdorff uniform space
of weight m, $(f_n)$ be a sequence of continuous functions of X into Y and f be a
function of X into Y such that $f(x) = \lim f_n(x)$ for every $x \in X$. Then the set of
points of discontinuity of f is the union of at most m ($\aleph_0$) nowhere dense subsets
of X if m is infinite (respectively, if m is finite).


Before turning to the proof, let us recall some definitions. The weight of a uniform
space $(Y, \mathcal{U})$ is the least cardinal number such that the uniformity $\mathcal{U}$ has a
basis of this cardinality. We may assume (see [1] page 186) that the elements of this
basis are symmetric, closed subsets of the product space $Y \times Y$. A subset of X is
nowhere dense if the interior of its closure is empty.


Proof of the theorem. Our proof follows that which appears in [2]. 

Let us denote by $\mathcal{B}$ a basis of the uniformity $\mathcal{U}$, whose cardinality is m and whose
elements are symmetric closed subsets of $Y \times Y$.

For each $U \in \mathcal{B}$, let us denote by D(U) the set of the points $x \in X$ such that
$f(V) \times f(V) - U \ne \emptyset$ for every neighborhood V of x. Thus, denoting by D the set
of the points of discontinuity of f, we have that

$D = \bigcup \{D(U) \mid U \in \mathcal{B}\}$.

We shall prove that each set D(U) is the union of a countable number of nowhere
dense subsets of X.

Fix $U \in \mathcal{B}$ and $W \in \mathcal{B}$ such that $W \circ W \circ W \subset U$. For each natural number
$k \ge 1$ put

$A_k = \{x \in X \mid (f_k(x), f_n(x)) \in W,\ \forall n \ge k\}$.

The identity $X = \bigcup \{A_k \mid k = 1, 2, \cdots\}$ holds by virtue of the convergence of the
sequence $(f_n)$. Each set $A_k$ is closed because the $f_n$ are continuous, and W is closed
in $Y \times Y$. Thus $D(U) = \bigcup \{D(U) \cap A_k \mid k = 1, 2, \cdots\}$ and every set $D(U) \cap A_k$ is
nowhere dense. Indeed, if x belongs to the interior of $A_k$, since $f_k$ is continuous,
there is an open neighborhood G of x, contained in $A_k$, such that $f_k(G) \times f_k(G) \subset W$.

If $y, z \in G$, then $(f(y), f_k(y)) \in W$, $(f_k(z), f(z)) \in W$, and $(f_k(y), f_k(z)) \in W$. So $(f(y), f(z)) \in U$
and x does not belong to D(U). The proof is completed.


References 


1. N. Bourbaki, Topologie Générale, livre 3, chapitres 1 & 2, Hermann, Paris, 1965. 
2. K. Kuratowski, Sur les fonctions représentables analytiquement et les ensembles de première
catégorie, Fund. Math., 5 (1924) 75-86.


GENERALIZED FIBONACCI NUMBER TRIPLES 


A. G. SHANNON, New South Wales Institute of Technology, Sydney, Australia, and
A. F. HORADAM, University of New England, Armidale, Australia and University of Reading, England


1. Introduction. It is possible to relate the results of Teigen and Hadwin [5] to
the generalized sequence of numbers, $\{w_n\}$, investigated by Horadam [3], and, at
the same time, to generalize the Fibonacci number triples related to $\{H_n\}$ previously
studied by Horadam in this MONTHLY [1], [2].
$\{w_n\}$ satisfies the general second order recurrence relation

(1.1) $w_n = p w_{n-1} - q w_{n-2}$ $(n \ge 2)$

with general initial conditions $w_0 = a$, $w_1 = b$, and where p and q are arbitrary
integers. When $p = -q = 1$, $\{w_n\} = \{H_n\}$.


2. Lemmas.

(2.1) $(p^2 - q) w_{n+2} - p w_{n+3} = q^2 w_n$

(2.2) $(p^2 - q) w_{n+2} + p w_{n+3} = 2(p^2 - q) w_{n+2} - q^2 w_n$

Proof of (2.1).
$(p^2 - q) w_{n+2} - p w_{n+3}$
$= (p^2 - q) w_{n+2} - p^2 w_{n+2} + pq w_{n+1}$ by (1.1)
$= -pq w_{n+1} + q^2 w_n + pq w_{n+1}$ by (1.1)
which gives the required result.
The proof of (2.2) follows immediately from (2.1).
3. Theorems.

(3.1) $\{(p/q^2) w_n w_{n+3}\}^2 + \{2P w_{n+2}(P w_{n+2} - w_n)\}^2 = \{w_n^2 + 2P w_{n+2}(P w_{n+2} - w_n)\}^2$

where $P = (p^2 - q)/2q^2$.

The three numbers in the Pythagorean-type formula (3.1) are called a generalized
Fibonacci triple.

(3.2) All Pythagorean triples are generalized Fibonacci triples.

Proof of (3.1). Multiply the corresponding sides of (2.1) and (2.2):

$(p^2 - q)^2 w_{n+2}^2 - p^2 w_{n+3}^2 = 2(p^2 - q) q^2 w_n w_{n+2} - q^4 w_n^2$.

Divide through by $q^4$ and rearrange to obtain

$\{(p/q^2) w_{n+3}\}^2 = w_n^2 + 4P w_{n+2}(P w_{n+2} - w_n)$.

Multiply through by $w_n^2$ and add $4P^2 w_{n+2}^2 (P w_{n+2} - w_n)^2$ to each side:

$\{(p/q^2) w_n w_{n+3}\}^2 + \{2P w_{n+2}(P w_{n+2} - w_n)\}^2$
$= (w_n^2)^2 + 2\{2P w_{n+2}(P w_{n+2} - w_n)\}(w_n^2) + \{2P w_{n+2}(P w_{n+2} - w_n)\}^2$
$= \{w_n^2 + 2P w_{n+2}(P w_{n+2} - w_n)\}^2$, as required.
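The identity (3.1) can be spot-checked with exact rational arithmetic. The following is a minimal sketch, not part of the original note: the function names `w_sequence` and `generalized_triple` and the sample values a = 2, b = 5, p = 3, q = -2 are assumptions chosen only for illustration.

```python
from fractions import Fraction


def w_sequence(a, b, p, q, length):
    """Return [w_0, ..., w_{length-1}] for w_n = p*w_{n-1} - q*w_{n-2}."""
    terms = [Fraction(a), Fraction(b)]
    while len(terms) < length:
        terms.append(p * terms[-1] - q * terms[-2])
    return terms[:length]


def generalized_triple(w, n, p, q):
    """Return the three numbers appearing in (3.1) for index n."""
    P = Fraction(p * p - q, 2 * q * q)            # P = (p^2 - q) / (2 q^2)
    leg1 = Fraction(p, q * q) * w[n] * w[n + 3]   # (p/q^2) w_n w_{n+3}
    leg2 = 2 * P * w[n + 2] * (P * w[n + 2] - w[n])
    hyp = w[n] ** 2 + leg2
    return leg1, leg2, hyp


# Illustrative parameters (hypothetical, not from the paper).
w = w_sequence(a=2, b=5, p=3, q=-2, length=10)
for n in range(6):
    x, y, z = generalized_triple(w, n, p=3, q=-2)
    assert x * x + y * y == z * z   # the Pythagorean-type relation (3.1)
```

Because `Fraction` is exact, the assertion tests the algebraic identity itself rather than a floating-point approximation of it.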


Proof of (3.2). Put $a = t(x - y)$, $b = t((1 + Pq)x - Pqy)/Pp$ in $\{w_n\}$ to obtain
the sequence

(3.3) $t(x - y)$, $t((1 + Pq)x - Pqy)/Pp$, $tx/P$, $t(x + y)q^2/p$, ...

For n = 0, (3.1) and (3.3) give $t^4(x^2 - y^2)^2 + (2t^2xy)^2 = t^4(x^2 + y^2)^2$, which proves
the theorem.


4. Examples. When $p = -q = 1$, $P = 1$, and (3.1) reduces to

(4.1) $(H_n H_{n+3})^2 + (2H_{n+1} H_{n+2})^2 = (H_n^2 + 2H_{n+1} H_{n+2})^2$

which is equation (3) of [2]. (3.1) in fact agrees with equation (2.2) of [3], namely,

$[(p w_{n+1} - q w_n)^2 - w_{n+1}^2]^2 + [2 w_{n+1}(p w_{n+1} - q w_n)]^2$
$= [(p w_{n+1} - q w_n)^2 + w_{n+1}^2]^2$

but (3.1) above is in a form which can be generalized for recurrence relations of order
higher than two as in Shannon and Horadam [4].

More specifically, when t = 1, and $p = -q = P = 1$, the formula in the proof of
(3.2) leads to

$(x^2 - y^2)^2 + (2xy)^2 = (x^2 + y^2)^2$

which gives primitive Pythagorean triples. For example, when x = 2, y = 1, we
obtain the primitive triple 3, 4, 5 from the Fibonacci sequence 1, 1, 2, 3, ... (derived
from (3.3)); when x = 3, y = 2, we obtain the primitive triple 5, 12, 13 from the
sequence 1, 2, 3, 5, ... Of course, if t = 3 (say) and $p = -q = P = 1$, we obtain the
nonprimitive triple 9, 12, 15 [= 3 (3, 4, 5)] from the sequence 3, 3, 6, 9, ...
[= 3 (1, 1, 2, 3, ...)].
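The passage from a pair (x, y) to the sequence (3.3) and then to a triple can be traced in code. This is a hedged sketch only: `sequence_33` and `triple_from_xy` are hypothetical helper names, and the default parameters p = 1, q = -1, t = 1 correspond to the Fibonacci case discussed above.

```python
from fractions import Fraction


def sequence_33(x, y, t=1, p=1, q=-1):
    """First four terms of (3.3), built from w_0 = a, w_1 = b and (1.1)."""
    P = Fraction(p * p - q, 2 * q * q)
    a = Fraction(t * (x - y))
    b = t * ((1 + P * q) * x - P * q * y) / (P * p)
    w = [a, b]
    for _ in range(2):
        w.append(p * w[-1] - q * w[-2])   # recurrence (1.1)
    return w, P


def triple_from_xy(x, y, t=1, p=1, q=-1):
    """Apply (3.1) with n = 0 to the sequence (3.3)."""
    w, P = sequence_33(x, y, t, p, q)
    leg1 = Fraction(p, q * q) * w[0] * w[3]
    leg2 = 2 * P * w[2] * (P * w[2] - w[0])
    return leg1, leg2, w[0] ** 2 + leg2


print(sequence_33(2, 1)[0])   # the Fibonacci start 1, 1, 2, 3 (as Fractions)
print(triple_from_xy(2, 1))   # the triple 3, 4, 5
print(triple_from_xy(3, 2))   # the triple 5, 12, 13
```

Other integer choices of p and q with P and p nonzero go through the same two functions; for instance p = 3, q = -2 with x = 2, y = 1 again gives 3, 4, 5.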


5. Methods of Teigen, Hadwin and Horadam. We conclude by showing how (4.1)
and the method of Teigen and Hadwin are related. Teigen and Hadwin proved that a
Pythagorean triple (a, b, c) can be represented by

(5.1) $a = x + z$, $b = y + z$, $c = x + y + z$,

where x, y, z satisfy

(5.2) x, y, z are positive, $2xy = z^2$, z is even.

If we set $x = H_n^2$, $z = 2H_n H_{n+1}$, then $z^2 = 2H_n^2(2H_{n+1}^2)$, so that $y = 2H_{n+1}^2$ from
(5.2). If we use (5.1) and (1.1) with $p = -q = 1$, we find that

$a = H_n^2 + 2H_n H_{n+1} = H_n H_{n+3}$;
$b = 2H_{n+1}^2 + 2H_n H_{n+1} = 2H_{n+1} H_{n+2}$;
$c = H_n^2 + 2H_{n+1}^2 + 2H_n H_{n+1} = H_n^2 + 2H_{n+1} H_{n+2}$.

Thus (a, b, c) is related to the Fibonacci triple of (4.1).
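The link between (5.1), (5.2) and (4.1) is easy to confirm numerically. The sketch below assumes a sequence with $H_n = H_{n-1} + H_{n-2}$; the names `fibonacci_like` and `teigen_hadwin_triple` are illustrative and do not come from the cited papers.

```python
def fibonacci_like(h0, h1, length):
    """Return H_0, ..., H_{length-1} with H_n = H_{n-1} + H_{n-2}."""
    terms = [h0, h1]
    while len(terms) < length:
        terms.append(terms[-1] + terms[-2])
    return terms[:length]


def teigen_hadwin_triple(H, n):
    """Build (a, b, c) from x = H_n^2, y = 2 H_{n+1}^2, z = 2 H_n H_{n+1}."""
    x = H[n] ** 2
    y = 2 * H[n + 1] ** 2
    z = 2 * H[n] * H[n + 1]
    assert 2 * x * y == z * z and z % 2 == 0   # condition (5.2)
    return x + z, y + z, x + y + z             # representation (5.1)


H = fibonacci_like(1, 1, 12)
for n in range(8):
    a, b, c = teigen_hadwin_triple(H, n)
    # Compare with the three numbers of (4.1).
    assert a == H[n] * H[n + 3]
    assert b == 2 * H[n + 1] * H[n + 2]
    assert c == H[n] ** 2 + 2 * H[n + 1] * H[n + 2]
    assert a * a + b * b == c * c
```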




References 


1. A. F. Horadam, A generalized Fibonacci sequence, this MONTHLY, 68 (1961), 455-459. 

2. A. F. Horadam, Fibonacci number triples, this MONTHLY, 68 (1961), 751-753.

3. A. F. Horadam, Special properties of the sequence $w_n(a, b; p, q)$, Fibonacci Quart., 5 (1967) 424-434.

4. A. G. Shannon and A. F. Horadam, A generalized Pythagorean theorem, Fibonacci Quart.,
9 (1971) 307-312.

5. M. G. Teigen and D. W. Hadwin, On generating Pythagorean triples, this MONTHLY, 78 
(1971), 378-379. 


AMBIVALENCE IN ALTERNATING SYMMETRIC GROUPS 


CLAIRE PARKINSON, Burlington, Vermont 
Referring to Higgins and Ballew [1, p. 274] we have the following 


DEFINITION. An ambivalent element of a group is one which is conjugate to its 
inverse; an ambivalent group is one all of whose elements are ambivalent. 


Certain groups are readily seen to be ambivalent, such as all full symmetric and
all dihedral groups. Others are readily seen to be nonambivalent, such as all odd
order groups of more than one element. In this article the ambivalence of alternating
symmetric groups will be examined, with the result reached that the only such
ambivalent groups are $A_1$, $A_2$, $A_5$, $A_6$, $A_{10}$, and $A_{14}$. Throughout, $S_n$ denotes the
full symmetric group on n letters and $A_n$ its subgroup consisting of all even
permutations.


LEMMA. $S_n$ is ambivalent.


Proof. Immediate from Herstein [2, pp. 75-76] since all elements in $S_n$ with the
same cycle decomposition are conjugate.

The lemma establishes that each element $x \in A_n$ has a conjugating element $t \in S_n$
taking $x \to x^{-1} = t^{-1}xt$. To show the ambivalence of x in $A_n$ we must show that
some such t is in $A_n$ as well as $S_n$.


THEOREM 1. The nonambivalent elements of $A_n$ are precisely the elements x with
cycle decomposition $\{l_1, l_2, \ldots, l_m\}$ satisfying the following three restrictions:

1. Each $l_i$ is odd.
2. $l_i = l_j \Rightarrow i = j$.
3. $\frac{1}{2}(n - m)$ is odd.

Proof. The method of proof will be to show first (a) if x is nonambivalent then 1
and 2 hold and second (b) if 1 and 2 hold then x is nonambivalent if and only if 3
holds also.


(a) If some $l_i$ is even, let $(x_1 x_2 \cdots x_{2q})$ be any cycle of even order in the disjoint-
cycle representation of x. Select $t \in S_n$ such that $t^{-1}xt = x^{-1}$. Then

$[(x_1 \cdots x_{2q})t]^{-1} x [(x_1 \cdots x_{2q})t] = t^{-1}(x_{2q} \cdots x_1) x (x_1 \cdots x_{2q}) t = t^{-1}xt = x^{-1}$;

and clearly precisely one of the two elements t and

$(x_1 \cdots x_{2q})t = (x_1 x_2)(x_1 x_3) \cdots (x_1 x_{2q})t$

is in $A_n$. But that makes x ambivalent in $A_n$. Thus for x to be nonambivalent, each
$l_i$ must be odd.

Now assume x is nonambivalent but Restriction 2 is not satisfied. Let $(x_1 x_2 \cdots x_w)$
and $(y_1 y_2 \cdots y_w)$ be two different cycles of the same order in x's representation. From
the above, w is odd. Selecting any $t \in S_n$ such that $t^{-1}xt = x^{-1}$, then

$[(x_1 y_1) \cdots (x_w y_w)t]^{-1} x [(x_1 y_1) \cdots (x_w y_w)t] = t^{-1}xt = x^{-1}$.

Again, precisely one of t and $(x_1 y_1) \cdots (x_w y_w)t$ is in $A_n$, making x ambivalent in $A_n$.
Thus Restriction 2 must be satisfied whenever x is nonambivalent.

(b) Finally, let $x = (x_{1,1} x_{1,2} \cdots x_{1,l_1})(x_{2,1} \cdots x_{2,l_2}) \cdots (x_{m,1} \cdots x_{m,l_m})$ be any element
satisfying the first two restrictions. Then

$x^{-1} = (x_{m,l_m} x_{m,l_m - 1} \cdots x_{m,1}) \cdots (x_{1,l_1} \cdots x_{1,1})$.

Since for any cycle y and any element $t \in S_n$, $t^{-1}yt$ is a cycle of the same length as y
[2, p. 76] and since

$t^{-1}xt = t^{-1}(x_{1,1} \cdots x_{1,l_1})t \cdot t^{-1}(x_{2,1} \cdots x_{2,l_2})t \cdots t^{-1}(x_{m,1} \cdots x_{m,l_m})t$,

Restriction 2 necessitates that any conjugating element t sending $x \to x^{-1}$ must take
each of the m cycles to its inverse. By Restriction 1, each of the elements of $S_{l_i}$
conjugating $(x_{i,1} \cdots x_{i,l_i})$ to its inverse has a disjoint-cycle representation consisting
of precisely $(l_i - 1)/2$ transpositions. There are $l_i$ such conjugating elements in $S_{l_i}$.
This yields $\prod_{i=1}^{m} l_i$ conjugating elements in $S_n$ taking $x \to x^{-1}$, each having a
disjoint-cycle representation with $\sum_{i=1}^{m} \frac{1}{2}(l_i - 1) = \frac{1}{2}(n - m)$ transpositions. These
$\prod_{i=1}^{m} l_i$ elements exhaust all such conjugating elements since $o(S_n)/\prod_{i=1}^{m} l_i$ is indeed
the order of x's conjugate class in $S_n$. Thus either $\frac{1}{2}(n - m)$ is even and all
$\prod_{i=1}^{m} l_i$ conjugating elements taking $x \to x^{-1}$ are in $A_n$, or $\frac{1}{2}(n - m)$ is odd and none
of the conjugating elements is in $A_n$. In the first case x is ambivalent, in the second x
is nonambivalent.
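For small n, Theorem 1 can be compared against a direct search over $A_n$. The following sketch is an illustration under stated assumptions: permutations are stored as tuples of images of 0, ..., n-1, fixed points count as cycles of length 1, and all function names are hypothetical. Since t ranges over a whole group, the choice of composition convention does not affect whether a conjugating element exists.

```python
from itertools import permutations


def cycle_lengths(p):
    """Cycle lengths of permutation p, fixed points included."""
    seen = [False] * len(p)
    lengths = []
    for start in range(len(p)):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = p[j]
                length += 1
            lengths.append(length)
    return lengths


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugate(x, t):
    """Relabel x by t: the result sends t[i] to t[x[i]]."""
    result = [0] * len(x)
    for i in range(len(x)):
        result[t[i]] = t[x[i]]
    return tuple(result)


def is_even(p):
    return (len(p) - len(cycle_lengths(p))) % 2 == 0


def theorem1_nonambivalent(p):
    """Restrictions 1-3 of Theorem 1 applied to the cycle type of p."""
    lengths = cycle_lengths(p)
    return (all(l % 2 == 1 for l in lengths)
            and len(set(lengths)) == len(lengths)
            and ((len(p) - len(lengths)) // 2) % 2 == 1)


def check_theorem1(n):
    """True if the criterion agrees with brute force for every x in A_n."""
    evens = [p for p in permutations(range(n)) if is_even(p)]
    for x in evens:
        x_inv = inverse(x)
        brute = not any(conjugate(x, t) == x_inv for t in evens)
        if brute != theorem1_nonambivalent(x):
            return False
    return True


print(all(check_theorem1(n) for n in range(1, 6)))
```

The search is quadratic in the order of $A_n$, so it is only practical for very small n.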


THEOREM 2. All alternating symmetric groups $A_n$ are nonambivalent except
$A_1$, $A_2$, $A_5$, $A_6$, $A_{10}$, $A_{14}$.


Proof. Case 1: n = 4t where t is an arbitrary positive integer. Here the element
$(x_1 \cdots x_{4t-1})$ is contained in $A_n$, but is nonambivalent by Theorem 1, its cycle
decomposition being {4t - 1, 1} and $\frac{1}{2}(n - m)$ being $\frac{1}{2}(4t - 2) = 2t - 1$.

Case 2: n = 4t - 1. As in Case 1, $(x_1 \cdots x_{4t-1})$ is in $A_{4t-1}$ and is nonambivalent,
with $\frac{1}{2}(n - m) = \frac{1}{2}((4t - 1) - 1) = 2t - 1$.

Case 3: n = 4t + 1. The element $(x_1 x_2 x_3)(x_4 \cdots x_{4t}) \in A_{4t+1}$ is nonambivalent
provided the cycle $(x_4 \cdots x_{4t})$ has length greater than 3. Thus $A_{4t+1}$ is nonambivalent
for t > 1.

Case 4: n = 4t + 2. Here $(x_1 x_2 x_3)(x_4 \cdots x_8)(x_9 \cdots x_{4t+1})$ is nonambivalent
provided $4t + 1 \ge 15$.

The above four cases show that all $A_n$ with $n \ne 1, 2, 5, 6, 10$, or 14 are nonambivalent.
The ambivalence of the remaining groups follows from a quick survey showing that no element
of those groups can satisfy the restrictions of Theorem 1.
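The "quick survey" can be mechanized: by Theorem 1, $A_n$ is ambivalent exactly when n has no partition into m distinct odd parts with $\frac{1}{2}(n - m)$ odd. The sketch below enumerates such partitions; the function names and the search limit are assumptions made for illustration.

```python
def distinct_odd_partitions(n, largest=None):
    """Yield the partitions of n into distinct odd parts, largest part first."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        if part % 2 == 1:
            for rest in distinct_odd_partitions(n - part, part - 1):
                yield (part,) + rest


def has_nonambivalent_element(n):
    """Restriction 3: (n - m)/2 odd, i.e. n - m = 2 (mod 4), for some cycle type."""
    return any((n - len(parts)) % 4 == 2
               for parts in distinct_odd_partitions(n))


def ambivalent_alternating_groups(limit):
    return [n for n in range(1, limit + 1)
            if not has_nonambivalent_element(n)]


print(ambivalent_alternating_groups(30))   # [1, 2, 5, 6, 10, 14]
```

Each partition stands for a cycle type with fixed points included, matching the convention {4t - 1, 1} used in Case 1.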


References 


1. Robert Higgins and David Ballew, An equation for finite groups, this MONTHLY, 78 (1971) 
274-275. 
2. I. N. Herstein, Topics in Algebra, Blaisdell, New York, 1965. 


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and 
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


CAN $\phi(n)$ PROPERLY DIVIDE n - 1?


RONALD ALTER, University of Kentucky 


The $\phi$ in the title is the Euler function $\phi(x)$ denoting the number of natural
numbers < x which are relatively prime to x. It is well known and easy to show that
$\phi(n) = n - 1$ if and only if n is a prime. The question of whether or not there exists
a composite integer n and an integer k > 1 so that the equation

(1) $k\phi(n) = n - 1$

has a solution, was first raised by D. H. Lehmer [2]. Lehmer proved that if such an n
exists then it must be odd, square free, and the product of at least seven distinct prime
numbers. Bateman and Kohlbecker [1, p. 238, exercise 14] raise the question again
when they state that there are no known examples for which m is not a prime and
$\phi(m) \mid (m - 1)$. Marshall [4] asks for a proof that equation (1) has no solutions for
any natural number k, where n is an odd square-free natural number such that no
prime factor divides the Euler function of any other prime factor.
Lehmer [2] also proves the following two theorems.
Lehmer [2] also proves the following two theorems. 




THEOREM 1. If p is a factor of n, then n contains no prime factors of the form px + 1.


THEOREM 2. If n is a solution of equation (1) for k = 3, then n is a product of
more than 32 distinct prime factors.


Schuh [5] claimed that if n is composite it consists of at least 11 distinct primes. He
proved the following theorem.


THEOREM 3. If $3 \mid n$ then k is of the form 3x + 1.


Recently, Lieuwens [3] extended Lehmer's main result by proving that n must be a
product of at least 11 distinct odd primes. He also proved the following two theorems.


THEOREM 4. If $3 \mid n$ then n is the product of more than 212 prime numbers and
$n > 5.5 \times 10^{570}$.


THEOREM 5. If the smallest prime factor of n is $\ge 7$ then n is the product of at
least 13 primes.


With current computing facilities one would think these results could be extended. 
However, a proof of the conjecture that equation (1) has no solutions for k > 1 still 
appears to be extremely difficult. 
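A direct search over a small range shows what "extending these results" by computer would involve. The sketch below sieves Euler's function and looks for composite n with $\phi(n) \mid (n - 1)$; the function names and the bound 100000 are assumptions for illustration. Lehmer's result quoted above already forces any solution to have at least seven distinct odd prime factors, hence to be at least 3 · 5 · 7 · 11 · 13 · 17 · 19 = 4849845, so an empty result in this range is expected.

```python
def totients_up_to(limit):
    """Sieve phi(0..limit); phi[p] is still p exactly when p is a new prime."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:                        # p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def lehmer_candidates(limit):
    """Composite n <= limit with phi(n) dividing n - 1."""
    phi = totients_up_to(limit)
    return [n for n in range(4, limit + 1)
            if phi[n] != n - 1                 # skip primes
            and (n - 1) % phi[n] == 0]


print(lehmer_candidates(100_000))   # []
```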


References 


1. E. Landau, Elementary Number Theory (transl. by J. E. Goodman with exercises by P. T. 
Bateman and E. E. Kohlbecker), Chelsea, New York, 1958. 

2. D. H. Lehmer, On Euler’s totient function, Bull. Amer. Math. Soc., 38(1932) 745-751. 

3. E. Lieuwens, Do there exist composite numbers M for which $k\phi(M) = M - 1$ holds?
Nieuw Arch. Wisk. (3), 18 (1970) 165-169.

4. A. Marshall, Problem E 2237, this MONTHLY, 77 (1970) 522.

5. F. Schuh, Can n - 1 be divided by $\phi(n)$ when n is composite? (Dutch), Math. Zutphen B,
13 (1944) 102-107.


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Manuscripts for this Department should be sent to Robert Gilmer, Department of Mathematics,
Florida State University, Tallahassee, FL 32306. Notes are usually limited to three printed
pages.

A DISCOVERY APPROACH TO e 


J. P. TuLL, The Ohio State University 


While teaching the first year course in mathematics at the University of Zambia
in 1970 and 1971, I came up with the following approach to exponential and
logarithmic functions. It is perhaps novel in one small way. Namely, once we had
defined the exponential and logarithmic functions to an arbitrary base, starting with the
usual $a^{p/q} = \sqrt[q]{a^p}$, we eventually came to the question of the derivative. As usual
we found

$(a^{x+h} - a^x)/h = a^x (a^h - 1)/h$

and so we needed only find the derivative at 0.

We noticed, by looking at graphs, that the derivative at 0 increases as a increases,
and there are large values and small values. There ought to be one particular base,
call it e, for which this derivative is 1. (Needless to say, we knew very little about
continuous functions at this stage.)

For this base e, since the graph of log is the reflection in y = x of the graph of
exp, then $\log' 1 = 1$. This means that

(*) $(1/\delta) \log(1 + \delta) \to 1$

as $\delta \to 0$. We readily find that $\log' x = 1/x$ by the direct approach, using (*). But
also from (*) we see that as $\delta \to 0$

$e^{(1/\delta) \log(1 + \delta)} \to e^1 = e$;

i.e., $(1 + \delta)^{1/\delta} \to e$ as $\delta \to 0$. Thus $e = \lim_{n \to \infty} (1 + 1/n)^n = \lim_{n \to \infty} (1 - 1/n)^{-n}$.
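The discovery argument can be imitated numerically: estimate the slope of $a^x$ at 0 for various bases, search for the base whose slope is 1, and compare with the two limits. This is a rough floating-point sketch, not part of the note; the step size, the bracketing interval [2, 3], and the function names are all assumptions.

```python
import math


def derivative_at_zero(a, h=1e-6):
    """Symmetric difference quotient of a**x at x = 0."""
    return (a ** h - a ** (-h)) / (2 * h)


def base_with_unit_slope(lo=2.0, hi=3.0, steps=60):
    """Bisection, relying on the slope at 0 increasing with the base a."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if derivative_at_zero(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n = 10 ** 6
print(base_with_unit_slope())   # close to e
print((1 + 1 / n) ** n)         # slightly below e
print((1 - 1 / n) ** (-n))      # slightly above e
print(math.e)
```

The two limit expressions approach e from opposite sides, which is visible in the printed values.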


SIMPLE PROOFS OF TWO ESTIMATES FOR e 
R. B. Darst, Colorado State University 


Let $a_n = [1 + (1/n)]^{n+1}$, $b_n = [1 - (1/n)]^n$, and $c_n = [1 + (1/n)]^n$, n = 1, 2, ....

In elementary calculus classes one frequently establishes (C): the sequence $\{c_n\}$
increases ($c_n < c_{n+1}$); sometimes one also shows (A): $\{a_n\}$ decreases, and (B): $\{b_n\}$
increases. Then $\lim a_n = \lim c_n$, and one can show that this common value is e.

We shall give simple proofs of (A), (B) and (C) based on the fact that
$(1 + x)^{n+1} > 1 + (n + 1)x$ when x > 0 and n is a positive integer. Thus, to establish
(A), notice that

$a_n / a_{n+1} = [(n+1)/n]^{n+1} / \{[(n+2)/(n+1)]^{n+1} [1 + 1/(n+1)]\}$

$= [(n+1)^2/(n^2 + 2n)]^{n+1} / [1 + 1/(n+1)]$

$= [1 + 1/(n^2 + 2n)]^{n+1} / [1 + 1/(n+1)]$

$> [1 + 1/(n+1)^2]^{n+1} / [1 + 1/(n+1)] > 1$.

Since $b_{n+1} = 1/a_n$, (B) is also established. Finally,

$c_{n+1} / c_n = [(n+2)/(n+1)]^{n+1} / \{[(n+1)/n]^{n+1} [n/(n+1)]\}$

$= [(n^2 + 2n)/(n+1)^2]^{n+1} / [1 - 1/(n+1)]$

$= [1 - 1/(n+1)^2]^{n+1} / [1 - 1/(n+1)]$

$= [b_{(n+1)^2} / b_{n+1}]^{1/(n+1)} > 1$.
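The three monotonicity claims, and the relation $b_{n+1} = 1/a_n$ used for (B), can be checked exactly for the first few n. A minimal sketch using rational arithmetic; the function names simply mirror the sequences above, and the range of n is an arbitrary illustrative choice.

```python
from fractions import Fraction


def a(n):
    return (1 + Fraction(1, n)) ** (n + 1)


def b(n):
    return (1 - Fraction(1, n)) ** n


def c(n):
    return (1 + Fraction(1, n)) ** n


for n in range(1, 50):
    assert a(n) > a(n + 1)         # (A): a_n decreases
    assert b(n) < b(n + 1)         # (B): b_n increases
    assert c(n) < c(n + 1)         # (C): c_n increases
    assert b(n + 1) == 1 / a(n)    # the relation behind (B)
    assert c(n) < a(n)             # c_n stays below a_n

print(float(c(49)), float(a(49)))  # two-sided bounds on e
```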


PROBLEMS AND SOLUTIONS 
EDITED By Emory P. STARKE 


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN,
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: EARL M. L.
BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, HOWARD W. EVES, OSKAR FEICHTINGER,
WILLIAM R. GEIGER, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS
S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND WILLIAM L. SOULE, JR.


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of Elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
May 31, 1973. Contributors (in the United States) who desire acknowledgment of receipt of 
their solutions are asked to enclose self-addressed stamped postcards. 

An asterisk (*) means neither the proposer nor the editors supplied a solution. 


E 2397.* Proposed by K. A. Brons, Cherry Hill, New Jersey


The ellipse has the property that corresponding to any point on it there exist two 
other points on it such that the tangent to the curve at any of the three points is 
parallel to the chord joining the other two. Do any other simple closed convex planar 
curves enjoy this property? 


E 2398. Proposed by C. W. Dodge, University of Maine at Orono 


Prove that the point of intersection of the diagonals of a parallelogram lies on
the pedal circle for any vertex with respect to the triangle formed by the other three
vertices.

E 2399. Proposed by D. E. Daykin, University of Reading, England 


Let T be an affine transformation of the plane; that is, Tx = Ax + b, where A
is a $2 \times 2$ real matrix and b is a fixed vector. Characterize those affine transformations
which have no fixed points and for which $TL \subset L$ for no line L.


E 2400. Proposed by A. W. Walker, Toronto, Canada 


If O, H, I, r, R are the circumcenter, orthocenter, incenter, inradius and
circumradius of any scalene triangle T, and P (defined as a limit point if T is
right-angled) is the orthocenter of the pedal triangle of H, then line OI divides line segment
PH internally as r : R.


E 2401.* Proposed by V. F. Ivanoff, San Carlos, California 


The exterior angle bisectors of a convex polygon $P_0$ form a polygon $P_1$, whose
exterior angle bisectors form a polygon $P_2$, and so on. Prove that $P_n$ approaches a
regular polygon as $n \to \infty$.


E 2402. Proposed by David Newman, Hartford, Connecticut 


Let n > 7 be an integer and let N = {2, 3, 4, ..., 2n}. Show that N has precisely
n + 5 n-element subsets S with the property that if i and j are distinct elements of S,
then $i + j \notin S$.


SOLUTIONS OF ELEMENTARY PROBLEMS 


A Construction without Compasses 


E 2337 [1972, 180]. Proposed by A. W. Walker, Toronto, Canada 


Show how to locate eleven coplanar points on eleven straight lines, with each 
point on three lines and three points on each line, using (a) straightedge and com- 
passes; (b) straightedge only. 


I. Solution by G. B. Robinson, SUNY at New Paltz. Solution to part (b):
Let ABCD be any quadrilateral, and let F be an arbitrary point on side AB.
Let $E = AC \cap BD$, $H = FE \cap CD$, $L = AH \cap FD$, $J = BL \cap AD$, $G = JE \cap BC$,
and $K = GD \cap FC$. These are the eleven points. When we have shown that K, E and L
are collinear, then the eleven lines will be LEK, BLJ, ALH, CKF, DKG, GEJ, ADJ,
BGC, FEH, AFB, and CHD. If we project (in 3-space) ABCD into a parallelogram,
then the indicated points taken in pairs are symmetric with respect to point E, so K,
E, and L are collinear. Observe that we have a total of 15 lines, since BED, AEC, FLD,
and BKH are also lines.

Solution to part (a): Draw a small circle in the upper left corner of the paper 
where it will not get in the way. Then proceed as in part (b). 
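The straightedge construction of part (b) can be replayed in homogeneous coordinates, where both "join two points" and "intersect two lines" are cross products. The sketch below is an illustration only: the starting quadrilateral and the point F are arbitrary integer choices, and the function names are hypothetical.

```python
def cross(u, v):
    """Cross product; joins two points or intersects two lines."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def meet(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4."""
    return cross(cross(p1, p2), cross(p3, p4))


def collinear(p, q, r):
    line = cross(p, q)
    return line[0] * r[0] + line[1] * r[1] + line[2] * r[2] == 0


def eleven_points(A, B, C, D, F):
    E = meet(A, C, B, D)
    H = meet(F, E, C, D)
    L = meet(A, H, F, D)
    J = meet(B, L, A, D)
    G = meet(J, E, B, C)
    K = meet(G, D, F, C)
    return {"A": A, "B": B, "C": C, "D": D, "E": E, "F": F,
            "G": G, "H": H, "J": J, "K": K, "L": L}


LINES = ["LEK", "BLJ", "ALH", "CKF", "DKG", "GEJ",
         "ADJ", "BGC", "FEH", "AFB", "CHD"]

# Assumed sample data: a convex quadrilateral and a point F on side AB.
pts = eleven_points((0, 0, 1), (4, 0, 1), (5, 3, 1), (1, 4, 1), (1, 0, 1))
print(all(collinear(*(pts[name] for name in line)) for line in LINES))
```

Integer coordinates keep every incidence test exact, so the collinearity of L, E, K is verified without rounding.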


II. Remarks by the proposer. The familiar Pascal $9_3$ and Desargues $10_3$
configurations naturally arouse curiosity about an $11_3$, but the only references known
to me appear in Encyk. der Math. Wiss., Vol. 3, Part 1-1, p. 486 (and p. 490, where
it is noted that all real $n_3$ configurations can be constructed with straightedge and
compasses, and many with straightedge alone); a French translation appears in
Encyc. des Sci. Math., Tom. 3, Vol. 2, pp. 153, 158. These references give no
construction details. See also Monat. für Math. u. Phys. 6 (1895), p. 255 where, in an
appendix to a paper on $12_3$ configurations, diagrams are given for all the 31 possible
$11_3$ configurations, but with no indication of the method of construction.


Also solved by Carolyn MacDonald, and by the proposer. 


A Point on a Radical Axis 


E 2338 [1972, 180]. Proposed by A. W. Walker, Toronto, Canada 


Straight lines AP, BP, CP meet the side lines BC, CA, AB of triangle ABC at 
points D, E, F. By Euclidean construction, locate P so that it lies on the radical 
axis of circles ABC and DEF. 


Solution by the proposer. Let X, Y, Z be the points where circle DEF meets
the lines BC, CA, AB again; then it is known [1, 2] that the lines AX, BY, CZ
concur at Q, and that the areal equation of the radical axis of circles ABC and DEF is

$\frac{x}{x_1 x_2} + \frac{y}{y_1 y_2} + \frac{z}{z_1 z_2} = 0$

with ABC as reference triangle and $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ as areal coordinates of
P, Q. Hence P lies on this radical axis if the isotomic conjugate of Q with
coordinates $(1/x_2, 1/y_2, 1/z_2)$ lies on x + y + z = 0, the "line at infinity." Thus we have
the following construction: through A, B, C draw parallel lines meeting BC, CA, AB
at U, V, W, and locate the reflections X, Y, Z of U, V, W in the midpoints of BC,
CA, AB respectively; then the circle XYZ meets the lines BC, CA, AB again at
D, E, F and the lines AD, BE, CF concur at a point P satisfying the required
condition.

An alternative construction for P is as follows. Take a point J on the circumcircle
of the triangle formed by the lines through A, B, C parallel to BC, CA, AB and locate
the reflections D, E, F of the points (AJ, BC), (BJ, CA), (CJ, AB) in the midpoints
of BC, CA, AB respectively; then the lines AD, BE, CF concur at a point P with
the stated property.

The proof depends on the following two results.

(1) If the inscribed conic with center K touches the sides BC, CA, AB of triangle 
ABC with centroid G at D, E, F respectively, and (vectorially) GJ = —2:GK, 
then the lines AD, BE, CF concur at P, the isotomic conjugate of J (and conversely). 

(2) If the centers of two conics inscribed in a triangle T are isogonal conjugates 
in T, the six contact points of the conics with the sides of T are concyclic (and con- 
versely ). 

A proof of (2) is given in [3], and (1) is an affine-projective generalization of 
the special case [4] for which K, J, P are the incenter, Nagel point, and Gergonne 
point of triangle ABC. 

Applying (1) to the above lines AX, BY, CZ, then if the isotomic conjugate of
their meet Q is at infinity, so is the center of the inscribed conic XYZ (a parabola).
But the isogonal transform of the "line at infinity" is the circle ABC, so it follows
from (2) that the center K of the inscribed conic DEF lies on circle ABC, and
therefore by another application of (1) the corresponding point J lies on the image of
this circle under the homothety (G, -2), justifying the second construction.




References 


1. A. Cayley, Messenger of Math., (2), 23 (1894) 24. 

2. Tôda Ono, Tôhoku Math. J., (1), 10 (1916) 31, 199.
3. R. Deaux, Mathesis, 63 (1954) 218. 

4. R. A. Johnson, Modern Geometry (1929) 184, 225. 


Also partially solved by Lew Kowarski. One incorrect solution was received. 


A Point not on a Radical Axis 


E 2339 [1972, 180]. Proposed by A. W. Walker, Toronto, Canada 


Points D, E, F are the feet of the perpendiculars to the sides of triangle ABC 
from a point P (# A, B or C) in the plane of the triangle. Prove that P cannot lie 
on the radical axis of circles ABC and DEF. (Cf. Problem E2338.) 


Solution by the proposer. Let a, b, c; A, B, C; R be the side lengths, angles,
and circumradius of triangle ABC. The areal equation (with ABC as reference triangle)
of the radical axis of circles ABC, DEF is [1]

$\sum bc\, x_0 (c y_0 + b z_0 \cos A)(b z_0 + c y_0 \cos A)\, x = 0$,

where $(x_0, y_0, z_0)$ are areal coordinates of P. If P lies on this line, its isogonal
conjugate Q with areal coordinates $(a^2/x_0, b^2/y_0, c^2/z_0)$ lies on the conic with areal
equation

$\sum a (bz + cy \cos A)(cy + bz \cos A) = 0$

which has the simpler alternative form

$4R^2 (x + y + z)^2 - (a^2 yz + b^2 zx + c^2 xy) = 0$.

Using a known expression [2, 3] for the distance OQ from the circumcenter O to
the point Q(x, y, z), this becomes

$4R^2 - (R^2 - OQ^2) = 0$, $OQ^2 = -3R^2$,

so that the conic locus of Q is an imaginary circle.


References 


1. John Casey, Treatise on Analytic Geometry, ed. 2, (1893), p. 136. 
2. Jour. de Math. Elém., (3), 2, (1888), p. 102. 
3. Mathesis, 63, (1954), 120. 


Doubly Stochastic Matrices 


E 2340 [1972, 180]. Proposed by Franz Hering, University of Washington 


A square matrix is doubly-stochastic if its entries are non-negative and if every
row sum and every column sum is one. Show that every doubly-stochastic matrix
(other than the one with all entries equal) contains a $2 \times 2$ submatrix

$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$

such that either min(a, d) > max(b, c) or max(a, d) < min(b, c).


Solution by S. S. Mitra, Wilkes College. Let $a_{st}$ be a maximal entry of the doubly
stochastic $n \times n$ matrix. Since the matrix does not have all entries equal, $a_{st} > 1/n$.
Let x be the minimum of all entries that appear in either the sth row or tth column.
Assume that $x = a_{sm}$ (we can give a similar argument in case $x = a_{mt}$). Since $a_{st} > 1/n$
it follows that $a_{sm} < 1/n$; that is, $a_{st} > a_{sm}$. This observation together with the fact
that the sum of the entries in both the tth and mth columns is one, allows us to
conclude that for some i, $a_{it} < a_{im}$. We have $a_{st} \ge a_{im} > a_{it} \ge a_{sm}$. The submatrix
consisting of the above four elements has the desired property.
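A brute-force search makes the statement concrete on small examples. The sketch below is not Mitra's argument: it simply tries every pair of rows and every pair of columns. The function name and the sample matrices are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations


def find_ordered_submatrix(matrix):
    """Return ((i, j), (k, l)) for a 2 x 2 submatrix [[a, b], [c, d]] with
    min(a, d) > max(b, c) or max(a, d) < min(b, c); None if there is none."""
    n = len(matrix)
    for i, j in combinations(range(n), 2):
        for k, l in combinations(range(n), 2):
            a, b = matrix[i][k], matrix[i][l]
            c, d = matrix[j][k], matrix[j][l]
            if min(a, d) > max(b, c) or max(a, d) < min(b, c):
                return (i, j), (k, l)
    return None


half, third = Fraction(1, 2), Fraction(1, 3)
sample = [[half, half, 0],
          [half, 0, half],
          [0, half, half]]
uniform = [[third] * 3 for _ in range(3)]

print(find_ordered_submatrix(sample))    # ((0, 1), (1, 2))
print(find_ordered_submatrix(uniform))   # None
```

Swapping the two rows of a submatrix exchanges the two alternative conditions, so scanning row and column pairs in increasing order loses nothing.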


Also solved by David Grinstein, Joel Levy, O. P. Lossers (Netherlands), Duston Stafford, The 
3-S Group of New York, R. J. Weber, and the proposer. 


A Not-So-Easy Urn Problem 


E 2341 [1972, 181]. Proposed by Harry Lass, Jet Propulsion Laboratory,
California Institute of Technology


Given n urns numbered 1, 2, ..., n and k objects. Suppose that each of the objects
is placed at random in one of the urns. For r = 1, 2, ..., n let $E_r$ be the event that
the number of objects in the first r urns does not exceed r. Find the probability
of the joint occurrence of $E_1, E_2, \ldots, E_n$. (Cf. E2252 [1971, 797].)


I. Solution by D. M. Bloom, Brooklyn College. Let f(n, k) be the number of
ways of putting k objects into n urns so that the events $E_1, \ldots, E_n$ occur jointly.
Then for $n \ge 1$,

(*) $f(n, k) = \sum_{i=0}^{k} \binom{k}{i} f(n - 1, i)$,

this equation holding for $0 \le k \le n$. Suppose there are i objects in the first n - 1
urns. There are $\binom{k}{i}$ ways to choose these i objects, and given this there are f(n - 1, i)
ways for them to be distributed in the first n - 1 urns so that $E_1, E_2, \ldots, E_{n-1}$ occur
jointly. The other k - i objects go into the nth urn and the event $E_n$ then occurs
automatically. Using (*) and a straightforward induction on n, one can then establish
that

$f(n, k) = (n + 1)^{k-1}(n + 1 - k)$

if $0 \le k \le n$ while f(n, k) = 0 if k > n. The desired probability is then
$n^{-k} f(n, k)$.


II. Solution by P. J. Burke, Bell Telephone Laboratories. If k > n, the
probability is obviously zero, so that we can assume that $k \le n$. The procedure described
is equivalent to the one in which the k objects are placed in n + 1 urns, under the
condition that the first urn is left empty. Using this equivalent procedure with
n + 1 urns, let $S_r$ be the event that the number of objects in the first r urns is less
than r; the required probability is equal to that of the joint event $S_2 \cap S_3 \cap \cdots \cap S_{n+1}$
given $S_1$. But the probability of $S_1$ is $n^k/(n + 1)^k$ and the probability of
$S_1 \cap S_2 \cap \cdots \cap S_{n+1}$ is $(n + 1 - k)/(n + 1)$ by Theorem 1, p. 10 of L. Takács,
Combinatorial Methods in the Theory of Stochastic Processes. Hence the required
probability is $(n + 1 - k)(n + 1)^{k-1}/n^k$.


Also solved by R. J. Weber, and the proposer. 


Editor's comment. If k is taken as a fixed percentage of n, say $k = \alpha n$, then
the probability approaches $(1 - \alpha)e^{\alpha}$ as $n \to \infty$.
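The closed form for the number of favorable placements can be compared with direct enumeration for small n and k. This is a sketch for illustration, with hypothetical function names; the enumeration lists all $n^k$ placements, so it is only feasible for tiny cases.

```python
from itertools import product


def count_by_enumeration(n, k):
    """Count placements of k labeled objects into urns 1..n for which
    the first r urns hold at most r objects, for every r = 1..n."""
    count = 0
    for placement in product(range(1, n + 1), repeat=k):
        if all(sum(1 for urn in placement if urn <= r) <= r
               for r in range(1, n + 1)):
            count += 1
    return count


def count_by_formula(n, k):
    """The closed form (n + 1)^(k - 1) (n + 1 - k), and 0 when k > n."""
    if k > n:
        return 0
    if k == 0:
        return 1
    return (n + 1) ** (k - 1) * (n + 1 - k)


for n in range(1, 5):
    for k in range(0, 6):
        assert count_by_enumeration(n, k) == count_by_formula(n, k)

# The probability for n = 3 urns and k = 3 objects is 16/27.
print(count_by_formula(3, 3), 3 ** 3)
```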


Mahler’s Second Congruence, Resurrected 


E 2342 [1972, 181]. Proposed (independently) by Joe Buhler, Reed College, 
and by M. B. Nathanson, University of Rochester 


If k and n are positive integers, what is the highest power of 2 that divides
$k^n - 1$? In particular, for a fixed k, find all values of n for which $k^n \equiv 1 \pmod{2^n}$.


Solution by Wells Johnson, Bowdoin College. Let f(k, n) be the exponent of the highest power
of 2 that divides $k^n - 1$ and let g(k, n) be the exponent of the highest power of 2 that divides $k^n + 1$.
If k is even, then obviously f(k, n) = 0 so we assume that k is odd in what follows.
Write $k - 1 = 2^r u$ and $k + 1 = 2^s v$, where $r, s \ge 1$ and u and v are odd. Then
$k^n = (1 + 2^r u)^n = 1 + 2^r nu + 2^{2r} w$ for some integer w (possibly 0 if n = 1). If n is
odd, it follows that f(k, n) = r, and hence $k^n \equiv 1 \pmod{2^n}$ for all odd n which do
not exceed r. A similar argument shows that g(k, n) = s when n is odd.

To attack the case of even n, we note first that since $k^{2n} - 1 = (k^n - 1)(k^n + 1)$,
we have the general formula f(k, 2n) = f(k, n) + g(k, n). Also, if $f(k, n) \ge 2$, then
$k^n \equiv 1 \pmod 4$ so that g(k, n) = 1 and hence in this case f(k, 2n) = f(k, n) + 1.
Suppose now that $n = 2^t m$ where m is odd and $t \ge 1$. Then f(k, 2m) = f(k, m)
+ g(k, m) = r + s $\ge 2$ so that by an easy induction it follows that $f(k, n) = f(k, 2^t m)$
$= f(k, 2^{t-1} m) + 1 = r + s + t - 1$. This solves the first part of the problem.
Let k be a fixed odd integer. Then every odd n which does not exceed r is a
solution of the congruence $k^n \equiv 1 \pmod{2^n}$; if n is even, then n solves the congruence
if and only if $r + s + t - 1 \ge n$, or equivalently if and only if $n - t \le r + s - 1$.
It is easy to see that given k (and thus r and s) there are only finitely many choices
for t and m and thus for n. For example, if $k = 2^{12} - 1 = 4095$, we see that r = 1
and s = 12 and thus $4095^n \equiv 1 \pmod{2^n}$ only for n = 1, 2, 4, 6, 8, 10, 12 and 16.
On the other hand, if k = 4097 so that r = 12 and s = 1, then the congruence
$4097^n \equiv 1 \pmod{2^n}$ has the solutions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, and 16.
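Both the exponent formula and the two worked examples are easy to confirm by machine. The following sketch uses hypothetical function names; the search bound 64 is an assumption that comfortably covers every n allowed by the inequality above for these two values of k.

```python
def two_adic_valuation(x):
    """Exponent of the highest power of 2 dividing the positive integer x."""
    if x <= 0:
        raise ValueError("x must be positive")
    count = 0
    while x % 2 == 0:
        x //= 2
        count += 1
    return count


def predicted_valuation(k, n):
    """The solution's formula for odd k >= 3: r if n is odd,
    and r + s + t - 1 if n = 2^t m with m odd and t >= 1."""
    r = two_adic_valuation(k - 1)
    s = two_adic_valuation(k + 1)
    if n % 2 == 1:
        return r
    return r + s + two_adic_valuation(n) - 1


def solutions(k, bound):
    """All n <= bound with k^n = 1 (mod 2^n)."""
    return [n for n in range(1, bound + 1) if pow(k, n, 2 ** n) == 1]


for k in range(3, 40, 2):
    for n in range(1, 13):
        assert two_adic_valuation(k ** n - 1) == predicted_valuation(k, n)

print(solutions(4095, 64))   # [1, 2, 4, 6, 8, 10, 12, 16]
print(solutions(4097, 64))   # [1, 2, ..., 12, 16]
```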


Also solved by Joe Albree, The Bennett College Team, M. T. Bird, D. M. Bloom, M. S. Demos,
O. H. Fraser, John Goth, M. G. Greening (Australia), Lew Kowarski, J. R. Kuttler, O. P. Lossers 
(Netherlands), Carolyn MacDonald, Grattan Murphy, Michael Shimshoni (Israel), Edith V. Sloan, 
D. P. Sumner, S. J. Tillman, R. J. Weber, J. R. Weiss, Brian Wesselink, Charles Wexler, and the 
proposers. 


Editor’s comment. Nathanson comments that this problem is a special case of a theorem proved 
by him in An exponential congruence of Mahler, this MONTHLY, 79 (1972) 55-57. At the time he 
submitted the paper, he had forgotten about his previous problem, which has, all told, led a very 
hard life. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers - The State University,
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with 
double spacing) on separate, signed sheets and should be mailed before May 31, 1973. Contribu- 
tors (in the United States) who desire acknowledgment of receipt of their solutions are asked 
to enclose self-addressed, stamped postcards. 


An asterisk (*) means neither the proposer nor the editors supplied a solution.


5895. Proposed by Frank Bernhart, Kansas State University 


For $n \ge 1$ distinguish $n$ points in the interior of a plane circle $C$ and $n + 2$ points
on the circumference. The $2n + 2$ points are to be connected in pairs by $2n + 1$
noncrossing arcs within $C$ so that (a) each point on the circumference is the endpoint
of one arc, (b) each interior point is the endpoint of three arcs, and (c) each pair of
endpoints on the circumference is connected by a path. In graph theory terms, a
cubic tree is inscribed in a circle. Case $n = 1$ is illustrated by three radii. If the points
on the circumference are labeled in cyclic order $x_1, x_2, \dots, x_{n+2}$, put $m_i$ = the number
of interior points on the unique path between $x_i$ and $x_{i+1}$ ($x_{n+3} = x_1$).

1. Find a recursive definition for the set of possible sequences

$$M = (m_1, m_2, \dots, m_{n+2}).$$


2. Start with $n = 1$ and increase the tree by successive steps each consisting of
randomly selecting a point $x_i$, moving it inside $C$, and joining it to the circumference
by two new arcs. Show that for a fixed integer $k \ge 1$, its fraction of occurrences in
the sequences obtained by such constructing is asymptotic to $2^{k-1}/3^k$ as $n \to \infty$.

3. Find a nonrecursive test for determining if a sequence M is a possible sequence. 


5896. Proposed by A. W. Schurle, University of North Carolina at Charlotte 


It is easy to show that the metric space (X, d) is complete if it is uniformly locally 
compact, i.e., if there is a positive $\epsilon$ such that $\{y : d(x, y) \le \epsilon\}$ is compact for all $x$.
Is the converse true for the real line, i.e., is every complete metric that yields the usual 
topology on the line uniformly locally compact? 


5897*. Proposed by I. J. Good, Virginia Polytechnic Institute and State 
University 


Prove that if z is real or complex and is not zero, then 


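$$e^{1/z} = 1 + \frac{1}{z - 1 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{3z - 1 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{5z - 1 +} \cdots$$

$$= [1, (2n + 1)z - 1, 1]_{n=0}^{\infty} \quad \text{(in a standard self-explanatory notation)}$$

$$= [1, (6n + 1)z - \tfrac{1}{2}, (24n + 12)z, (6n + 5)z - \tfrac{1}{2}, 1]_{n=0}^{\infty}.$$
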
5898*. Proposed by Sylvester Reese, Baruch College, New York City




Is the set of zeros of all entire functions with rational coefficients (for their 
Maclaurin series) the field of complex numbers? 


5899. Proposed by Joel Spencer, University of California, Los Angeles 


Professor Sédre is, once again, unprepared for his Epsilondeltopology class. He
has prepared the first half of his lecture in which he proves a certain $n$ propositions
$P_1, \dots, P_n$ equivalent. He had planned the most efficient proof, by showing
$P_1 \Rightarrow P_2 \Rightarrow \cdots \Rightarrow P_n \Rightarrow P_1$. (The theorems $P_i \Rightarrow P_j$ take an equal amount of time to
prove.) Then he notices he may essentially double the length of his proof (from $n$ to
$2n - 2$) by showing $P_1 \Leftrightarrow P_2 \Leftrightarrow \cdots \Leftrightarrow P_n$. This method of proof is irredundant, that is,
if any implication is deleted we may not deduce that $P_1, \dots, P_n$ are equivalent. Prove
that this is the longest (in terms of number of implications) irredundant method of
proof.


SOLUTIONS OF ADVANCED PROBLEMS 
The Distribution of Lebesgue-measurable Sets 


5821 [1971, 1027]. Proposed by Eric Langford, University of Maine 
Let $I$ denote the unit interval $[0, 1]$. (a) Suppose that $E$ is a Lebesgue-measurable
subset of $I$ such that $0 < m(E) < 1$. Show that

$$\text{(i)}\ \inf_J \frac{m(E \cap J)}{m(J)} = 0, \qquad \text{(ii)}\ \sup_J \frac{m(E \cap J)}{m(J)} = 1,$$

where the supremum and infimum are taken over the class of all nontrivial, proper
subintervals $J$ of $I$.

(b) Does there exist a set $E$ such that for every $J$, $0 < m(E \cap J) < m(J)$? I.e.,
does there exist a set $E$ which meets every nontrivial interval in a set of positive
measure, and whose complement $I \setminus E$ does likewise?


Solution by J. W. Grossman, Massachusetts Institute of Technology. 
(a) (ii). Suppose not. Then there is $\delta > 0$ such that $m(E \cap J) < (1 - \delta)m(J)$ for all
$J$. Let $0 < \epsilon < \delta(1 - \delta)^{-1}m(E)$ and choose an open set $U$ containing $E$ such that
$m(U) < m(E) + \epsilon$ (by definition of measure). Write $U$ as the disjoint union of intervals
$J_i$. We then have

$$m(E) + \epsilon > m(U) = \sum_i m(J_i) > \frac{1}{1 - \delta} \sum_i m(E \cap J_i) = \frac{1}{1 - \delta}\, m(E)$$

or $\epsilon > \delta(1 - \delta)^{-1}m(E)$, contrary to the choice of $\epsilon$.

(i) follows by applying (ii) to the complement of E. 

(b) Yes. Fix $0 < \alpha < 1$. By a generalized Cantor set on an interval $J$ we mean
the set which remains after removing at the $i$th stage the middle “third” of length
$\alpha\, m(J)\, 3^{-i}$ from each interval remaining at that stage. The measure of a generalized
Cantor set is $(1 - \alpha)m(J)$, and its complement is dense in $J$. (See Royden, Real
Analysis, p. 63.) Now write $I = (C_0 \cup C_1 \cup C_2 \cup \cdots) \cup L$, where $C_0$ is the Cantor
set; for $n \ge 1$, $C_n$ is the (countable) union of generalized Cantor sets, one on each
interval of $I - (C_0 \cup C_1 \cup \cdots \cup C_{n-1})$; and $L$ is what’s left.

Note that for $n \ge 1$ we have $m(C_n) = \alpha^{n-1}(1 - \alpha)$. Let $E = L \cup C_1 \cup C_3 \cup C_5 \cup \cdots$,
so that $I - E = C_0 \cup C_2 \cup C_4 \cup \cdots$. Let an interval $J$ be given. Then some interval
$J'$ of $I - (C_0 \cup C_1 \cup C_2 \cup \cdots \cup C_N)$ is contained in $J$ for some large $N$. But then
$m(J \cap E) \ge (1 - \alpha)m(J') > 0$, while $m(J \cap (I - E)) \ge \alpha(1 - \alpha)m(J') > 0$, which
implies $m(J \cap E) < m(J)$, as desired.


Also solved by Michel Bousquet, Max Broberg (Sweden), R. A. Christiansen, C. V. Coffman, 
R. O. Davies (England), Henry Fast, Neal Felsinger, G. J. Foschini, Barbara A. Keller, Douglas Lind,
S. S. Mitra, Rollin Sandberg, R. M. Warten, A. Wilansky, and the proposer. 


Editorial Note. Several solvers note that (a) is an immediate consequence of the Lebesgue density 
theorem while (b) appears as an exercise in several texts, e.g., Natanson, Theory of Functions of a 
Real Variable, p. 88. Wilansky offers other references for part (b): A. Settari (Math. Rev. 1968. 7
5892), C. V. Coffman (this MONTHLY 1965, p. 941). Coffman cites a theorem of J. J. Shaffer which
depends on a category argument: Let $B$ be the class of Borel sets in $[0, 1]$, and let $([0, 1], B, \nu)$ be a
finite measure space such that $\nu(J) > 0$ for every interval $J \subset [0, 1]$. Then there exists a Borel
set $D$ such that $\nu(J \cap D) > 0$ and $\nu(J \setminus D) > 0$ for every interval $J \subset [0, 1]$.




Number of Intersections of a Secant with its Curve 


5822 [1971, 1027]. Proposed by Joseph Malkevitch, York College, New York 
City 


Does there exist a planar simple closed curve K such that every line through 
every point in the interior of K meets the boundary of K in precisely $2r$ points ($r$ an
integer $\ge 2$)?


Solution by H. Guggenheimer, Polytechnic Institute of Brooklyn. The following
theorem implies that the answer is in the negative.


THEOREM. A Jordan curve of finite order (i.e., there is a finite maximal number 
of intersections of a line and the curve) has a secant that meets the curve in exactly 
two points. 


Proof. Choose any interior point P and a point X on the curve at maximal 
distance from P. The curve is contained in the closed disk of radius PX. Choose a 
secant of the circle parallel to the tangent at X and at distance $\epsilon$ from that tangent.
The curve then intersects the circular segment bounded by the secant and the circular
arc containing X in a finite number of continua. For any continuum not containing
X the minimal distance from the tangent is positive. Hence, if we take $\epsilon$ smaller than
the minimum of a finite number of positive quantities we arrive at a secant parallel to
the tangent that intersects only the continuum containing X. 

Since any ray issued from P intersects the curve in a finite number of points, the 
curve must intersect PX transversally and is starshaped for P in a neighborhood of X. 
X cannot be the interior point of a straight segment on the curve. A repetition of the 
previous argument also shows that the radius vector from P can have only a finite 
number of relative maxima in a neighborhood of X. Hence, the curve is locally convex 
near X and some secant (in fact, all secants close enough to the tangent) intersects 
the curve only in two points. 




REMARKS. (1) The negative result also follows from Otto Haupt and Hermann 
Künneth, Geometrische Ordnungen, Springer 1967, 2. Satz, Sec. 1.5. That theorem
is of extraordinary generality. The theorem proved can be obtained without effort 
not only for straight lines but for general order characteristics from Behauptung II, 
1. Satz, same reference. 

(2) The adjoining figure gives a curve that admits one point such that all lines 
through that point intersect the curve in 4 points. Query: Does there exist a curve 
with two such points? 

(3) Some introductions to the geometry of plane curves are: C. Juel, Einleitung 
in die Theorie der ebenen Elementarkurven dritter und vierter Ordnung, Kgl.
Danske Vidensk. Selsk. Skr., (7) 11, (1914) 113-167. 

L. Locher-Ernst, Einführung in die freie Geometrie ebener Kurven, Birkhäuser,
Basel 1952. 

A. Marchaud, Sur les continus d’ordre borné, Acta Math., 55 (1930) 67-115. 


Also solved by L. E. Mattics, and by Max Onoberg. 


Domains with Linearly Ordered Primary Ideals 


5823 [1971, 1028]. Proposed by J. T. Arnold, Virginia Polytechnic Institute, 
and J. W. Brewer, University of Kansas 


Let D be an integral domain with identity. Show that if the primary ideals of D 
are linearly ordered under $\subseteq$ and if D satisfies the ascending chain condition on prime
ideals, then D is a valuation ring. 


Solution by Søren Jøndrup, University of Copenhagen, Denmark. Proof is to
be by induction on n, the number of prime ideals in D. We shall use two lemmas: 


LEMMA 1. If the commutative ring R has only one prime ideal, then 0 is a primary
ideal. 

We observe that the set of zero divisors is a union of prime ideals and that the 
set of nilpotent elements is an intersection of prime ideals. Thus, with only one 
prime ideal, every zero divisor is nilpotent. 


LEMMA 2. If a commutative ring $R$ has linearly ordered primary ideals, then for
every prime ideal $P$ in $R$, the ring $R_P$ has the same property.


$I \mapsto I_P$ is a one-to-one order preserving correspondence between the primary
ideals of $R$ contained in $P$ and the primary ideals of $R_P$.

Let $n = 1$. Let $P$ be the only prime ideal of $D$. Let $I$ be a proper ideal of $R$. If
$I \subseteq P$, then $R/I$ has just one prime ideal, $P/I$. By Lemma 1, $I$ is primary. If $I \not\subseteq P$,
then $R/I$ has no prime ideals, so every element of $R/I$ is nilpotent and again $I$ is
primary. Since the primary ideals are linearly ordered, for any $x, y \in D$ we have
$xD \subseteq yD$ or $yD \subseteq xD$.

Now let $D$ be an integral domain with exactly $n$ prime ideals and with linearly
ordered primary ideals. Let the prime ideals of $D$ be $P_1, \dots, P_n$ and let $P_j \subseteq P_{j+1}$
for all $j$. If $I$ is a proper ideal not contained in $P_{n-1}$, then $I$ is primary by Lemma 1.
Since the primary ideals in $D$ are linearly ordered, $I \supseteq P_{n-1}$. So, for any two elements
$x, y$ not both in $P_{n-1}$, we have $xD \subseteq yD$ or $yD \subseteq xD$. Therefore we have to prove
that for $x, y \in P_{n-1}$ it is true that $xD \subseteq yD$ or $yD \subseteq xD$. If we localize at $P_{n-1}$, then
by the induction hypothesis and Lemma 2,

$$xD_{P_{n-1}} \subseteq yD_{P_{n-1}} \quad \text{or} \quad yD_{P_{n-1}} \subseteq xD_{P_{n-1}}.$$

Let us assume that the first inclusion holds, thus we can find $r, t \in D$, $t \notin P_{n-1}$ such
that $tx = ry$. We know that $rD \subseteq tD$ or $tD \subseteq rD$, and the result is proved.


Also solved by John Coolidge and by the proposer. 


A Gamma Function Limit 


5824 [1971, 1028]. Proposed by N.F. Neuts, Purdue University 


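Show that for every finite complex number $u$,

$$\lim_{n \to +\infty} \exp(-u\gamma\sqrt{n}) \cdot \Gamma^n\!\left(1 - \frac{u}{\sqrt{n}}\right) = \exp\!\left(\frac{\pi^2 u^2}{12}\right),$$

where $\Gamma(\cdot)$ is the gamma function and $\gamma$ is Euler’s constant.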


Solution by R. G. Buschman, University of Wyoming. For $n > |u|^2$ let
$z = -u/\sqrt{n}$ in the formula

$$\log \Gamma(1 + z) + \gamma z = \sum_{m=2}^{\infty} (-1)^m \zeta(m) z^m / m,$$

[Higher Transcendental Functions, A. Erdélyi, et al., formula 1.17 (2)]. If we multiply
each side of the equation by $n$ and take exponentials we have

$$\exp(-\gamma u\sqrt{n})\, \Gamma^n(1 - u/\sqrt{n}) = \exp(u^2 \zeta(2)/2 + f(n)).$$

Since $f(n) \to 0$ as $n \to \infty$ and $\zeta(2) = \pi^2/6$, the result follows.
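
The limit can be checked numerically for real $u$. The following is only a sketch, not part of the published solution: the function name `log_lhs` and the hard-coded value of Euler's constant are assumptions made for illustration. It evaluates the logarithm of the left-hand side, which should approach $\pi^2 u^2/12$.

```python
import math

# Euler's constant, hard-coded for this illustration.
EULER_GAMMA = 0.5772156649015329

def log_lhs(u, n):
    """Logarithm of exp(-u*gamma*sqrt(n)) * Gamma(1 - u/sqrt(n))**n for real u < sqrt(n)."""
    root = math.sqrt(n)
    # math.lgamma returns log|Gamma(x)|; here 1 - u/root > 0, so Gamma is positive.
    return -u * EULER_GAMMA * root + n * math.lgamma(1.0 - u / root)

if __name__ == "__main__":
    for u in (0.5, 1.0):
        print(u, log_lhs(u, 10**6), math.pi**2 * u**2 / 12)
```

The leading correction is of order $\zeta(3)u^3/(3\sqrt{n})$, so agreement to about three decimal places is all that should be expected at $n = 10^6$.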


Also solved by J. A. Boa, D. Borwein, Robert Breusch, Paul Bugl, M. L. Glasser, A. A. Jagers 
(Netherlands), Vaclav Konetny, O. P. Lossers (Netherlands), M. H. Moore, C. C. Rousseau, O. G. 
Ruehr, David Shelupsky, P. H. Young, and the proposer. 

Shelupsky states the limit in the following form: If $f(z_0) = 0$, $f(z) \in C^3$ in a neighborhood of
$z_0$, then

$$u\sqrt{n}\, f'(z_0) + n f(z_0 - u/\sqrt{n}) \to u^2 f''(z_0)/2.$$


Roots of a Minimal Polynomial 


5825 [1971, 1028]. Proposed by Erwin Just, Bronx Community College 

Assume that $\alpha$ and $\beta$ are real numbers, $\beta \ne 0$, such that $\alpha + \beta i$ is a zero of $f(x)$, a
cubic polynomial with rational coefficients. If $g(x)$ is the minimal polynomial (with
rational coefficients) of $\beta i$, can any of the zeros of $g(x)$ be real?


Solution by Irving Gerst, State University of New York at Stony Brook. 
The zeros of $f(x)$ are $\gamma_1 = \alpha + \beta i$, $\gamma_2 = \alpha - \beta i$ and a real zero $\gamma_3$. Then

$$h(x) = \prod_{\substack{i, j = 1 \\ i \ne j}}^{3} \left[x - (\gamma_i - \gamma_j)/2\right]$$

has $\beta i$ as a zero, has no real zeros, and, by the symmetric function theorem, has
rational coefficients. Since $g(x)$ is a divisor of $h(x)$, no zero of $g(x)$ can be real.


Also solved by Robert Breusch, R. O. Davies (England), G. J. Foschini, Anne Grams & Tom 
Parker, P. R. Hafner (New Zealand), A. A. Jagers (Netherlands), L. E. Mattics, P. L. Montgomery, 
P. J. Owens (England), Nicholas Passell, Stephen Pierce, Wesley Tom, Elizabeth Yip, and the
proposer.

Montgomery offers the polynomial $x^4 - 2$ to show that some restriction on the degree of $f(x)$
is necessary.


REVIEWS 


EDITED BY J. ARTHUR SEEBACH, JR. AND LYNN A. STEEN 


with the assistance of the mathematics departments of St. Olaf and Carleton Colleges 
COLLABORATING EDITOR FOR FILMS: SEYMOUR SCHUSTER, CARLETON COLLEGE 


Printed materials for reviews should be sent to: Book Review Editor, American Mathematical 
Monthly, St. Olaf College, Northfield, MN 55057. Films and correspondence relating to films 
should be sent to Seymour Schuster, Carleton College, Northfield MN 55057. 

All unsigned material is written by the editors. A boldface capital C in the margin indicates 
that a review is based in part on classroom use. Professors willing to write such a review should 
inform the editor in order to avoid duplication. 


Probability and Mathematical Statistics: An Introduction. By Eugene Lukacs. Academic
Press, New York, 1972. x+242 pp. $8.50. (Telegraphic Review, April 1972.)

Introductory Statistics and Probability: A Basis for Decision Making. By David W. 
Blakeslee, William G. Chinn. Houghton Mifflin, Boston, Massachusetts, 1971. 
ix + 356 pp. $7.95. (Telegraphic Review, April 1972.) 


THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 By BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 NUMBER 3
MARCH 1973
CODEN: AMMYAE
CONTENTS
Hilbert’s Tenth Problem is Unsolvable . . . . . . . . . . MARTIN DAVIS 233
History of the Riemann Mapping Theorem . . . . . . . . . J. L. WALSH 270
The First U.S.A. Mathematical Olympiad . . . . . . . . . S. GREITZER 276
Correction to “What is a Reciprocity Law?” . . . . . . . B. F. WYMAN 281
MATHEMATICAL NOTES
On Elementary Proofs of Peano’s Existence Theorems . . . JOHANN WALTER 282
A Remark Concerning Absolutely Continuous Functions . . F. S. VAN VLECK 286
On Non-Associative Algebras Derived from Graphs . . . . W. E. JENNER 288
A Finite Difference Proof that $E = mc^2$ . . . . . . DONALD GREENSPAN 289
RESEARCH PROBLEMS
Reachability Problems in Vector Addition Systems . . . . B. O. NASH 292
When do All A-Sequences Modulo m have Period One? . . . E. A. PARBERRY AND NANCY GRAUDONS 295
CLASSROOM NOTES
On Injective Modules . . . . . . . . . . . . . . . . . AZMI HANNA 297
The Hamel Dimension of any Infinite Dimensional Separable Banach Space is c . . H. ELTON LACEY 298
A Note on Conformality . . . . . . . . . . . . . . . . R. K. WILLIAMS 299
A Wronskian Condition Related to Ordinary Differential Equations . . L. C. EGGAN AND A. I. INSEL 300
MATHEMATICAL EDUCATION
Teaching Applicable Mathematics . . . . . . . . . . . . E. A. BENDER 302
(Continued on inside cover)
Individualized Instruction in Large Enrollment Mathematics Courses . . BERT WAITS 307
Female Mathematicians, Where are You? . . . . . . . . . VIOLET H. LARNEY 310
CUPM Report to the Board of Governors, August 1972 . . . 313
ELEMENTARY PROBLEMS AND SOLUTIONS . . . 315
ADVANCED PROBLEMS AND SOLUTIONS . . . 324
REVIEWS . . . 330
NEWS AND NOTICES . . . 346
MATHEMATICAL ASSOCIATION OF AMERICA . . . 347
Committee on Educational Media . . . 347
Proceedings of the 1971 Summer Conference held at the University of Missouri, Rolla . . . 347
Calendars of Future Meetings . . . 348


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p. 2). Manuscript preparation: Please
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY.
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for
protection against loss.

Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes
10 months, Math. Education 10 months.

EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of
Mathematics, Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate
Editor; ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of
America, SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to
A. B. WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D. C.
20036.


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect 
ASSOCIATE EDITORS 


JOSHUA BARLAZ J. G. HARVEY 

E. R. BERLEKAMP ERIC S. LANGFORD
JANE W. DI PAOLA P. D. LAX 

ROBERT GILMER ARTHUR MATTUCK 
RICHARD GUY M. W. POWNALL 
RAOUL HAILPERN GIAN-CARLO ROTA 


SEYMOUR SCHUSTER 

J. ARTHUR SEEBACH, Jr. 
E. P. STARKE 

LYNN A. STEEN 

JAMES WENDEL 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 


Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


HILBERT’S TENTH PROBLEM IS UNSOLVABLE 
MARTIN DAVIS, Courant Institute of Mathematical Sciences


When a long outstanding problem is finally solved, every mathematician would 
like to share in the pleasure of discovery by following for himself what has been 
done. But too often he is stymied by the abstruseness of so much of contemporary 
mathematics. The recent negative solution to Hilbert’s tenth problem given by 
Matiyasevič (cf. [23], [24]) is a happy counterexample. In this article, a complete
account of this solution is given; the only knowledge a reader needs to follow the 
argument is a little number theory: specifically basic information about divisibility 
of positive integers and linear congruences. (The material in Chapter 1 and the 
first three sections of Chapter 2 of [25] more than suffices.) 

Hilbert’s tenth problem is to give a computing algorithm which will tell of a
given polynomial Diophantine equation with integer coefficients whether or not it
has a solution in integers. Matiyasevič proved that there is no such algorithm.

Hilbert’s tenth problem is the tenth in the famous list which Hilbert gave in his 
1900 address before the International Congress of Mathematicians (cf. [18]). The 
way in which the problem has been resolved is very much in the spirit of Hilbert’s 
address in which he spoke of the conviction among mathematicians “that every
definite mathematical problem must necessarily be susceptible of a precise settlement,
either in the form of an actual answer to the question asked, or by the proof of the
impossibility of its solution ...” (italics added). Concerning such impossibility proofs
Hilbert commented: 

“Sometimes it happens that we seek the solution under unsatisfied hypotheses
or in an inappropriate sense and are therefore unable to reach our goal. Then the 
task arises of proving the impossibility of solving the problem under the given 
hypotheses and in the sense required. Such impossibility proofs were already given 
by the ancients, in showing, e.g., that the hypotenuse of an isosceles right triangle 
has an irrational ratio to its leg. In modern mathematics the question of the impos- 
sibility of certain solutions has played a key role, so that we have acquired the 
knowledge that such old and difficult problems as to prove the parallel axiom, to 
square the circle, or to solve equations of the fifth degree in radicals have no solution 
in the originally intended sense, but nevertheless have been solved in a precise and 
completely satisfactory way.”


Martin Davis received his Princeton Ph. D. under Alonzo Church. He has held positions at Univ. 
of Illinois, IAS, Univ. of Calif.-Davis, Ohio State Univ., Rensselaer Poly, Yeshiva Univ. and New 
York Univ., and he spent a leave at Westfield College, London. He has done research in various 
aspects of the foundations of mathematics, and is the author of Computability and Unsolvability 
(McGraw-Hill, 1958), The Undecidable (editor, Raven Press, 1965), Lectures on Modern Mathematics 
(Gordon and Breach, 1967), and First Course in Functional Analysis (Gordon and Breach, 1967). 
Editor. 




Matiyasevič’s negative solution of Hilbert’s tenth problem is of just this character.
It is not a solution in Hilbert’s “originally intended sense” but rather a “precise and
completely satisfactory” proof that no such solution is possible. The methods needed
to make it possible to prove the non-existence of algorithms had not been developed
in 1900. These methods are part of the theory of recursive (or computable) functions,
developed by logicians much later ([6] is an exposition of recursive function theory).
In this article no previous knowledge of recursive function theory is assumed. The 
little that is needed is developed in the article itself. 

What will be proved in the body of this article is that no algorithm exists for 
testing a polynomial with integer coefficients to determine whether or not it has 
positive integer solutions (Hilbert inquired about arbitrary integer solutions). But 
then it will follow at once that there can be no algorithm for integer solutions 
either. For one could test the equation 


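$$P(x_1, \dots, x_n) = 0$$

for possession of positive solutions $\langle x_1, \dots, x_n \rangle$ by testing

$$P(1 + p_1^2 + q_1^2 + r_1^2 + s_1^2, \dots, 1 + p_n^2 + q_n^2 + r_n^2 + s_n^2) = 0$$

for possession of integer solutions $\langle p_1, q_1, r_1, s_1, \dots, p_n, q_n, r_n, s_n \rangle$. This is because (by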
a well-known theorem of Lagrange) every non-negative integer is the sum of four 
squares. (Just this once the stated prerequisite is exceeded! Cf. [17], p. 302.) In the 
body of this article, only positive integers will be dealt with— except when the 
contrary is explicitly stated. 
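
The substitution just described can be illustrated on a small case. The sketch below is an illustration only: the toy polynomial `P`, the helper names, and the brute-force search are assumptions of this example, not anything taken from the article. It writes $x - 1$ as a sum of four squares for each coordinate of a known positive solution and confirms that the substituted polynomial vanishes.

```python
from math import isqrt

def four_squares(n):
    """Return non-negative (p, q, r, s) with p*p + q*q + r*r + s*s == n, by brute force."""
    for p in range(isqrt(n) + 1):
        for q in range(p, isqrt(n - p * p) + 1):
            for r in range(q, isqrt(n - p * p - q * q) + 1):
                rest = n - p * p - q * q - r * r
                s = isqrt(rest)
                if s * s == rest:
                    return (p, q, r, s)
    raise AssertionError("no representation found; Lagrange's theorem rules this out")

def P(x1, x2):
    # Toy polynomial chosen for illustration: x1^2 - 2*x2^2 - 1.
    return x1 * x1 - 2 * x2 * x2 - 1

def substituted(quads):
    """Evaluate P at x_k = 1 + p_k^2 + q_k^2 + r_k^2 + s_k^2."""
    xs = [1 + sum(t * t for t in quad) for quad in quads]
    return P(*xs)

positive_solution = (3, 2)  # P(3, 2) = 9 - 8 - 1 = 0
quads = [four_squares(x - 1) for x in positive_solution]

if __name__ == "__main__":
    print(quads, substituted(quads))
```

Conversely, any integer solution of the substituted equation yields positive values $x_k = 1 + p_k^2 + q_k^2 + r_k^2 + s_k^2$, which is why the two decision problems stand or fall together.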

When Matiyasevič announced his beautiful and ingenious solution in January
1970, it had been known for a decade that the unsolvability of Hilbert’s tenth problem
would follow if one could construct a Diophantine equation whose solutions were
such that one of its components grew roughly exponentially with another of its
components. (In §9, this is explained more precisely.) Matiyasevič showed how the
Fibonacci numbers could be used to construct such an equation. In this article the 
historical development of the subject will not be followed; the aim has rather been to 
give as smooth and straightforward an account of the main results as seems currently 
feasible. A brief appendix gives the history. 


1. Diophantine Sets. In this article the usual problem of Diophantine equations 
will be inverted. Instead of being given an equation and seeking its solutions, one 
will begin with the set of ““solutions”’ and seek a corresponding Diophantine equation. 
More precisely: 


DEFINITION. A set $S$ of ordered $n$-tuples of positive integers is called Diophantine
if there is a polynomial $P(x_1, \dots, x_n, y_1, \dots, y_m)$, where $m \ge 0$, with integer coefficients
such that a given $n$-tuple $\langle x_1, \dots, x_n \rangle$ belongs to $S$ if and only if there exist positive
integers $y_1, \dots, y_m$ for which

$$P(x_1, \dots, x_n, y_1, \dots, y_m) = 0.$$

Borrowing from logic the symbols “$\exists$” for “there exists” and “$\Leftrightarrow$” for “if and
only if”, the relation between the set $S$ and the polynomial $P$ can be written succinctly
as:

$$\langle x_1, \dots, x_n \rangle \in S \Leftrightarrow (\exists y_1, \dots, y_m)\,[P(x_1, \dots, x_n, y_1, \dots, y_m) = 0],$$

or equivalently:

$$S = \{\langle x_1, \dots, x_n \rangle \mid (\exists y_1, \dots, y_m)\,[P(x_1, \dots, x_n, y_1, \dots, y_m) = 0]\}.$$


Note that P may (and in non-trivial cases always will) have negative coefficients.
The word “polynomial” should always be so construed in the article except where
the contrary is explicitly stated. Also all numbers in this article are positive integers
unless the contrary is stated.

The main question which will be discussed (and settled) in this article is:

Which sets are Diophantine? A vague paraphrase of the eventual answer is: any
set which could possibly be Diophantine is Diophantine. What does the phrase
“which could possibly be Diophantine” mean? And how is all this related to Hilbert’s
tenth problem? These quite reasonable questions will only be answered much later.
In the meantime, the task will be developing techniques for showing that various sets 
are indeed Diophantine. 

A few very simple examples: 

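(i) the numbers which are not powers of 2:

$$x \in S \Leftrightarrow (\exists y, z)\,[x = y(2z + 1)],$$

(ii) the composite numbers:

$$x \in S \Leftrightarrow (\exists y, z)\,[x = (y + 1)(z + 1)],$$

(iii) the ordering relation on the positive integers; that is the sets $\{\langle x, y \rangle \mid x < y\}$,
$\{\langle x, y \rangle \mid x \le y\}$:

$$x < y \Leftrightarrow (\exists z)\,(x + z = y),$$

$$x \le y \Leftrightarrow (\exists z)\,(x + z - 1 = y),$$

(iv) the divisibility relation; that is $\{\langle x, y \rangle \mid x \mid y\}$:

$$x \mid y \Leftrightarrow (\exists z)\,(xz = y).$$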


Examples (i) and (ii) suggest, as other sets to consider, the set of powers of 2 and 
of primes respectively. As we shall eventually see, these sets are Diophantine; but the 
proof is not at all easy. 
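
Examples (i) through (iv) can be checked by bounded search, since in each case a witness never exceeds the number being tested. The following sketch is an illustration only; the helper `exists` and the function names are hypothetical, and all quantified variables range over positive integers as in the definition.

```python
from itertools import product

def exists(pred, bound, arity):
    """Bounded search: is there a tuple of positive integers <= bound satisfying pred?"""
    return any(pred(*w) for w in product(range(1, bound + 1), repeat=arity))

def not_power_of_two(x):          # (i)   x = y(2z + 1)
    return exists(lambda y, z: x == y * (2 * z + 1), x, 2)

def composite(x):                 # (ii)  x = (y + 1)(z + 1)
    return exists(lambda y, z: x == (y + 1) * (z + 1), x, 2)

def less(x, y):                   # (iii) x + z = y
    return exists(lambda z: x + z == y, y, 1)

def less_or_equal(x, y):          # (iii) x + z - 1 = y
    return exists(lambda z: x + z - 1 == y, y, 1)

def divides(x, y):                # (iv)  xz = y
    return exists(lambda z: x * z == y, y, 1)

if __name__ == "__main__":
    print([x for x in range(1, 20) if not not_power_of_two(x)])  # powers of 2 below 20
    print([x for x in range(1, 20) if composite(x)])
```

The bounded search is only a convenience for experimenting; the definitions themselves place no bound on the witnesses.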

Another example: 

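(v) the set $W$ of $\langle x, y, z \rangle$ for which $x \mid y$ and $x < z$: Here

$$x \mid y \Leftrightarrow (\exists u)\,(y = xu) \quad \text{and} \quad x < z \Leftrightarrow (\exists v)\,(z = x + v).$$

Hence,

$$\langle x, y, z \rangle \in W \Leftrightarrow (\exists u, v)\,[(y - xu)^2 + (z - x - v)^2 = 0].$$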


Note that the technique just used is perfectly general. So, in defining a Diophantine
set one may use a simultaneous system $P_1 = 0, P_2 = 0, \dots, P_k = 0$ of polynomial
equations since this system can be replaced by the equivalent single equation:

$$P_1^2 + P_2^2 + \cdots + P_k^2 = 0.$$
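
As a small check of the sum-of-squares device, the sketch below compares the direct description of the set $W$ from example (v) with its single-equation form. It is an illustration under the assumption that a bounded search suffices (the witnesses $u$ and $v$ never exceed $\max(y, z)$); the function names are hypothetical.

```python
def in_W_direct(x, y, z):
    """x divides y and x < z, tested directly."""
    return y % x == 0 and x < z

def in_W_single_equation(x, y, z):
    """Search positive u, v with (y - x*u)^2 + (z - x - v)^2 == 0."""
    bound = max(y, z)
    return any(
        (y - x * u) ** 2 + (z - x - v) ** 2 == 0
        for u in range(1, bound + 1)
        for v in range(1, bound + 1)
    )

if __name__ == "__main__":
    triples = [(x, y, z) for x in range(1, 9) for y in range(1, 9) for z in range(1, 9)]
    print(all(in_W_direct(*t) == in_W_single_equation(*t) for t in triples))
```

A sum of squares of integers vanishes only when every term vanishes, which is all the device relies on.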


By a “function” a positive integer valued function of one or more positive integer
arguments will always be understood.


DEFINITION. A function $f$ of $n$ arguments is called Diophantine if

$$\{\langle x_1, \dots, x_n, y \rangle \mid y = f(x_1, \dots, x_n)\}$$

is a Diophantine set (i.e., $f$ is Diophantine if its “graph” is Diophantine).

Another question that will be answered here is: which functions are Diophantine?
An important Diophantine function is associated with the triangular numbers,
that is numbers of the form:

$$T(n) = 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}.$$

Since $T(n)$ is an increasing function, for each positive integer $z$, there is a unique
$n \ge 0$ such that

$$T(n) < z \le T(n + 1) = T(n) + n + 1.$$

Hence each $z$ is uniquely representable as:

$$z = T(n) + y; \qquad y \le n + 1,$$

or equivalently, uniquely representable as:

$$z = T(x + y - 2) + y.$$

In this case, one writes $x = L(z)$, $y = R(z)$; also one sets

$$P(x, y) = T(x + y - 2) + y.$$

Note that $L(z)$, $R(z)$ and $P(x, y)$ are Diophantine functions since

$$z = P(x, y) \Leftrightarrow 2z = (x + y - 2)(x + y - 1) + 2y$$
$$x = L(z) \Leftrightarrow (\exists y)\,[2z = (x + y - 2)(x + y - 1) + 2y]$$
$$y = R(z) \Leftrightarrow (\exists x)\,[2z = (x + y - 2)(x + y - 1) + 2y].$$


The function $P(x, y)$ maps the set of ordered pairs of positive integers one-one
onto the set of positive integers. And, for each $z$, the ordered pair which is mapped into
$z$ by $P(x, y)$ is $(L(z), R(z))$. (“P” is for “pair”, “L” for “left”, and “R” for “right”.)
Note also that $L(z) \le z$, $R(z) \le z$. To summarize:


THEOREM 1.1 (Pairing Function Theorem$^1$). There are Diophantine functions
$P(x, y)$, $L(z)$, $R(z)$ such that

(1) for all $x, y$, $L(P(x, y)) = x$, $R(P(x, y)) = y$, and

(2) for all $z$, $P(L(z), R(z)) = z$, $L(z) \le z$, $R(z) \le z$.
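
The pairing functions are easy to compute directly. The sketch below follows the definitions $T$, $P$, $L$, $R$ given above; the helper `unpair` and its use of an integer square root to locate the largest $n$ with $T(n) < z$ are choices made for this illustration only.

```python
from math import isqrt

def T(n):
    """Triangular number 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def P(x, y):
    """Pairing function P(x, y) = T(x + y - 2) + y on positive integers."""
    return T(x + y - 2) + y

def unpair(z):
    """Return (L(z), R(z)): find the largest n with T(n) < z, then y = z - T(n)."""
    n = (isqrt(8 * (z - 1) + 1) - 1) // 2   # largest n with n(n + 1)/2 <= z - 1
    y = z - T(n)                            # 1 <= y <= n + 1
    x = n + 2 - y                           # so that x + y - 2 = n
    return x, y

def L(z):
    return unpair(z)[0]

def R(z):
    return unpair(z)[1]

if __name__ == "__main__":
    for z in range(1, 11):
        print(z, unpair(z))
```

Listing the pairs for $z = 1, 2, 3, \dots$ shows the familiar enumeration along the diagonals $x + y = \text{const}$.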

Another useful Diophantine function is related to the Chinese Remainder Theorem, 
stated below: 


DEFINITION. The numbers $m_1, \dots, m_N$ are called an admissible sequence of moduli
if $i \ne j$ implies that $m_i$ and $m_j$ are relatively prime.


THEOREM 1.2 (Chinese Remainder Theorem). Let $a_1, \dots, a_N$ be any positive
integers and let $m_1, \dots, m_N$ be an admissible sequence of moduli. Then there is an $x$
such that:

$$x \equiv a_1 \bmod m_1$$
$$x \equiv a_2 \bmod m_2$$
$$\vdots$$
$$x \equiv a_N \bmod m_N.$$


The Chinese remainder theorem is proved for example in [25], p. 33. (That x can 
be assumed positive is not ordinarily stated. But since the product of the moduli 
added to a solution gives another solution, this is obvious.) 

Now let the function $S(i, u)$ be defined as follows:

$$S(i, u) = w,$$

where $w$ is the unique positive integer for which:

$$w \equiv L(u) \bmod 1 + iR(u)$$
$$w \le 1 + iR(u).$$

Here $w$ is simply the least positive remainder when $L(u)$ is divided by $1 + iR(u)$.


THEOREM 1.3 (Sequence Number Theorem). There is a Diophantine function
$S(i, u)$ such that

(1) $S(i, u) \le u$, and

(2) for each sequence $a_1, \dots, a_N$, there is a number $u$ such that

$$S(i, u) = a_i \quad \text{for } 1 \le i \le N.$$


Proof. The first task is to show that $S(i, u)$ as defined just above, is a Diophantine
function. The claim is that $w = S(i, u)$ if and only if the following system of equations
has a solution:

$$2u = (x + y - 2)(x + y - 1) + 2y$$
$$x = w + z(1 + iy)$$
$$1 + iy = w + v - 1.$$

This is because (by the discussion leading to the Pairing Function Theorem), the
first equation is equivalent to:

$$x = L(u) \quad \text{and} \quad y = R(u).$$

Then (using a technique already noted) one needs only sum the squares of the three
equations to see that $S(i, u)$ is Diophantine.

Now $S(i, u) \le L(u) \le u$. So finally, let $a_1, \dots, a_N$ be given numbers. Choose $y$ to
be some number greater than each of $a_1, \dots, a_N$ and divisible by each of $1, 2, \dots, N$.
Then the numbers $1 + y, 1 + 2y, \dots, 1 + Ny$ are an admissible sequence of moduli.
(For, if $d \mid 1 + iy$ and $d \mid 1 + jy$, $i < j$, then $d \mid [j(1 + iy) - i(1 + jy)]$, i.e., $d \mid j - i$
so that $d < N$; but this is impossible unless $d = 1$ because $d \mid y$.) This being the case,
the Chinese Remainder Theorem can be applied to obtain a number $x$ such that

$$x \equiv a_1 \bmod 1 + y$$
$$x \equiv a_2 \bmod 1 + 2y$$
$$\vdots$$
$$x \equiv a_N \bmod 1 + Ny.$$

Let $u = P(x, y)$, so that $x = L(u)$ and $y = R(u)$. Then, for $i = 1, 2, \dots, N$

$$a_i \equiv L(u) \bmod 1 + iR(u)$$

and $a_i < y = R(u) < 1 + iR(u)$. But then by definition, $a_i = S(i, u)$.
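
The construction in this proof can be carried out mechanically. The following sketch is an illustration only: the names `pair`, `unpair`, and `encode` are hypothetical, $y$ is taken to be a multiple of $N!$ (one convenient number divisible by each of $1, \dots, N$), and the congruences are solved one modulus at a time using Python's modular inverse `pow(M, -1, m)` (Python 3.8 or later).

```python
from math import factorial, isqrt

def T(n):
    return n * (n + 1) // 2

def pair(x, y):
    """P(x, y) = T(x + y - 2) + y."""
    return T(x + y - 2) + y

def unpair(z):
    """Return (L(z), R(z))."""
    n = (isqrt(8 * (z - 1) + 1) - 1) // 2
    y = z - T(n)
    return n + 2 - y, y

def S(i, u):
    """Least positive remainder of L(u) on division by 1 + i*R(u)."""
    x, y = unpair(u)
    modulus = 1 + i * y
    w = x % modulus
    return w if w > 0 else modulus

def encode(seq):
    """Find u with S(i, u) == seq[i - 1] for 1 <= i <= len(seq); entries are positive."""
    N = len(seq)
    step = factorial(N)                      # divisible by each of 1, ..., N
    y = step * (max(seq) // step + 1)        # a multiple of N! exceeding every entry
    x, M = 0, 1
    for i, a in enumerate(seq, start=1):
        m = 1 + i * y                        # moduli are pairwise relatively prime
        t = ((a - x) * pow(M, -1, m)) % m    # solve x + M*t = a (mod m)
        x += M * t
        M *= m
    return pair(x, y)                        # x >= seq[0] >= 1, so x is positive

if __name__ == "__main__":
    sequence = [3, 1, 4, 1, 5]
    u = encode(sequence)
    print(u, [S(i, u) for i in range(1, len(sequence) + 1)])
```

The number $u$ produced this way is far from minimal; the theorem only asserts that some $u$ exists.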
A striking characterization of Diophantine sets of positive integers (cf. [26]) is 
given by: 


THEOREM 1.4. A set $S$ of positive integers is Diophantine if and only if there
is a polynomial $P$ such that $S$ is precisely the set of positive integers in the range of $P$.


Proof. If $S$ is related to $P(x_1, \dots, x_m)$ as in the theorem then

$$x \in S \Leftrightarrow (\exists x_1, \dots, x_m)\,[x = P(x_1, \dots, x_m)].$$

Conversely, let

$$x \in S \Leftrightarrow (\exists x_1, \dots, x_m)\,[Q(x, x_1, \dots, x_m) = 0].$$

Let $P(x, x_1, \dots, x_m) = x[1 - Q^2(x, x_1, \dots, x_m)]$. Then, if $x \in S$, choose $x_1, \dots, x_m$ such
that $Q(x, x_1, \dots, x_m) = 0$. Then $P(x, x_1, \dots, x_m) = x$; so $x$ is in the range of $P$. On the
other hand, if $z = P(x, x_1, \dots, x_m)$, $z > 0$, then $Q(x, x_1, \dots, x_m)$ must vanish (otherwise
$1 - Q^2 \le 0$) so that $z = x$ and $x \in S$.
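
The converse direction of this proof can be tried on the composite numbers of example (ii). In the sketch below, $Q(x, y, z) = x - (y + 1)(z + 1)$ and $P = x(1 - Q^2)$; the choice of example, the search box, and the function names are assumptions made for illustration.

```python
from itertools import product

def Q(x, y, z):
    """Q vanishes exactly when x = (y + 1)(z + 1)."""
    return x - (y + 1) * (z + 1)

def P(x, y, z):
    """P = x(1 - Q^2): equal to x when Q = 0, and non-positive otherwise."""
    return x * (1 - Q(x, y, z) ** 2)

def positive_values(bound):
    """Positive values taken by P on the box 1 <= x, y, z <= bound."""
    return {
        P(x, y, z)
        for x, y, z in product(range(1, bound + 1), repeat=3)
        if P(x, y, z) > 0
    }

if __name__ == "__main__":
    print(sorted(positive_values(30)))
```

On the box used here the positive values are exactly the composite numbers up to the bound; negative values and zero also occur in the range of $P$ and are simply discarded.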


2. Twenty-four easy lemmas. The first major task is to prove that the exponential
function $h(n, k) = n^k$ is Diophantine. This is the hardest thing we shall have to do.
The proof is in §3. In this section we develop the methods we shall need, using the
so-called Pell equation:

$$(*) \qquad x^2 - dy^2 = 1, \qquad x, y \ge 0,$$

where

$$d = a^2 - 1, \qquad a > 1.$$

Although this is a famous equation with a considerable literature,$^2$ a self-contained
treatment is given. Note the obvious solutions to (*):

$$x = 1 \quad y = 0$$
$$x = a \quad y = 1.$$

LEMMA 2.1. There are no integers $x, y$, positive, negative, or zero, which satisfy
(*) for which $1 < x + y\sqrt{d} < a + \sqrt{d}$.


Proof. Let $x, y$ satisfy (*). Since

$$1 = (a + \sqrt{d})(a - \sqrt{d}) = (x + y\sqrt{d})(x - y\sqrt{d}),$$

the inequality implies (taking negative reciprocals) $-1 < -x + y\sqrt{d} < -a + \sqrt{d}$.
Adding the inequalities: $0 < 2y\sqrt{d} < 2\sqrt{d}$, i.e., $0 < y < 1$, a contradiction.


LEMMA 2.2. Let $x, y$ and $x', y'$ be integers, positive, negative, or zero which
satisfy (*). Let

$$x'' + y''\sqrt{d} = (x + y\sqrt{d})(x' + y'\sqrt{d}).$$

Then, $x'', y''$ satisfies (*).


Proof. Taking conjugates: $x'' - y''\sqrt{d} = (x - y\sqrt{d})(x' - y'\sqrt{d})$. Multiplying
gives:

$$(x'')^2 - d(y'')^2 = (x^2 - dy^2)((x')^2 - d(y')^2) = 1.$$

DEFINITION. $x_n(a)$, $y_n(a)$ are defined for $n \ge 0$, $a > 1$, by setting

$$x_n(a) + y_n(a)\sqrt{d} = (a + \sqrt{d})^n.$$

Where the context permits, the dependence on $a$ is not explicitly shown, writing
$x_n$, $y_n$.

LEMMA 2.3. X,,), Satisfy (*). 


240 MARTIN DAVIS [March 


Proof. This follows at once by induction using Lemma 2.2. 
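
The definition and Lemma 2.3 are easy to check numerically. The following is a minimal sketch in which pairs $(x, y)$ stand for $x + y\sqrt{d}$; the function names `pell_mul` and `pell` are assumptions introduced for the illustration.

```python
def pell_mul(p, q, d):
    # Product (x1 + y1*sqrt(d)) * (x2 + y2*sqrt(d)), as in Lemma 2.2.
    (x1, y1), (x2, y2) = p, q
    return (x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)


def pell(a, n):
    # (x_n(a), y_n(a)), obtained by expanding (a + sqrt(d))^n with d = a^2 - 1.
    d = a * a - 1
    result = (1, 0)
    for _ in range(n):
        result = pell_mul(result, (a, 1), d)
    return result


for n in range(5):
    x, y = pell(2, n)
    print(n, x, y, x * x - 3 * y * y)  # last column is always 1
```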
LEMMA 2.4. Let $x, y$ be a non-negative solution of (*). Then for some $n$, $x = x_n$,
$y = y_n$.


Proof. To begin with $x + y\sqrt{d} \ge 1$. On the other hand the sequence $(a + \sqrt{d})^n$
increases to infinity. Hence for some $n \ge 0$,

$$(a + \sqrt{d})^n \le x + y\sqrt{d} < (a + \sqrt{d})^{n+1}.$$

If there is equality, the result is proved; so suppose otherwise:

$$x_n + y_n\sqrt{d} < x + y\sqrt{d} < (x_n + y_n\sqrt{d})(a + \sqrt{d}).$$

Since $(x_n + y_n\sqrt{d})(x_n - y_n\sqrt{d}) = 1$, the number $x_n - y_n\sqrt{d}$ is positive. Hence,
$1 < (x + y\sqrt{d})(x_n - y_n\sqrt{d}) < a + \sqrt{d}$. But this contradicts Lemmas 2.1 and 2.2.
The defining relation:

$$x_n + y_n\sqrt{d} = (a + \sqrt{d})^n$$

is a formal analogue of the familiar formula:

$$(\cos u) + (\sin u)\sqrt{-1} = e^{iu} = (\cos 1 + (\sin 1)\sqrt{-1})^u,$$

with $x_n$ playing the role of cos, $y_n$ playing the role of sin and $d$ playing the role of $-1$.
Thus, the familiar trigonometric identities have analogues in which $-1$ is replaced
by $d$ at appropriate places. For example the Pell equation itself

$$x_n^2 - dy_n^2 = 1$$

is just the analogue of the Pythagorean identity. Next analogues of the familiar
addition formulas are obtained.


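LEMMA 2.5. $x_{m \pm n} = x_m x_n \pm d\,y_n y_m$ and $y_{m \pm n} = x_n y_m \pm x_m y_n$.


Proof.

$$x_{m+n} + y_{m+n}\sqrt{d} = (a + \sqrt{d})^{m+n} = (x_m + y_m\sqrt{d})(x_n + y_n\sqrt{d}) = (x_m x_n + d\,y_n y_m) + (x_n y_m + x_m y_n)\sqrt{d}.$$

Hence,

$$x_{m+n} = x_m x_n + d\,y_n y_m, \qquad y_{m+n} = x_n y_m + x_m y_n.$$

Similarly, $(x_{m-n} + y_{m-n}\sqrt{d})(x_n + y_n\sqrt{d}) = x_m + y_m\sqrt{d}$. So

$$x_{m-n} + y_{m-n}\sqrt{d} = (x_m + y_m\sqrt{d})(x_n - y_n\sqrt{d}),$$

and one proceeds as above.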




LEMMA 2.6. $y_{m \pm 1} = a\,y_m \pm x_m$ and $x_{m \pm 1} = a\,x_m \pm d\,y_m$.


Proof. Take n = 1 in Lemma 2.5. 
The familiar notation (x, y) is used to symbolize the g.c.d. of x and y. 


LEMMA 2.7. $(x_n, y_n) = 1$.
Proof. If $d \mid x_n$ and $d \mid y_n$, then $d \mid x_n^2 - dy_n^2$, i.e., $d \mid 1$.
LEMMA 2.8. $y_n \mid y_{nk}$.


Proof. This is obvious when $k = 1$. Proceeding by induction, using the addition
formula (Lemma 2.5),

$$y_{n(m+1)} = x_n y_{nm} + x_{nm} y_n.$$

By the induction hypothesis $y_n \mid y_{nm}$. Hence, $y_n \mid y_{n(m+1)}$.
LEMMA 2.9. $y_n \mid y_t$ if and only if $n \mid t$.


Proof. Lemma 2.8 gives the implication in one direction. For the converse
suppose $y_n \mid y_t$ but $n \nmid t$. So one can write $t = nq + r$, $0 < r < n$. Then,

$$y_t = x_r y_{nq} + x_{nq} y_r.$$

Since (by Lemma 2.8) $y_n \mid y_{nq}$, it follows that $y_n \mid x_{nq} y_r$. But $(y_n, x_{nq}) = 1$. (If $d \mid y_n$,
$d \mid x_{nq}$, then by Lemma 2.8 $d \mid y_{nq}$ which, by Lemma 2.7, implies $d = 1$.) Hence $y_n \mid y_r$.
But, since $r < n$, we have $y_r < y_n$ (e.g., by Lemma 2.6). This is a contradiction.


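LEMMA 2.10. $y_{nk} \equiv k\,x_n^{k-1} y_n \bmod (y_n)^3$.

Proof.

$$x_{nk} + y_{nk}\sqrt{d} = (a + \sqrt{d})^{nk} = (x_n + y_n\sqrt{d})^k = \sum_{j=0}^{k} \binom{k}{j} x_n^{k-j} y_n^j d^{j/2}.$$

So,

$$y_{nk} = \sum_{\substack{j=1 \\ j \text{ odd}}}^{k} \binom{k}{j} x_n^{k-j} y_n^j d^{(j-1)/2}.$$

But all terms of this expansion for which $j > 1$ are $\equiv 0 \bmod (y_n)^3$.
LEMMA 2.11. $y_n^2 \mid y_{n y_n}$.
Proof. Set $k = y_n$ in Lemma 2.10.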


LEMMA 2.12. If $y_n^2 \mid y_t$, then $y_n \mid t$.




Proof. By Lemma 2.9, $n \mid t$. Set $t = nk$. Using Lemma 2.10, $y_n^2 \mid k\,x_n^{k-1} y_n$, i.e.,
$y_n \mid k\,x_n^{k-1}$. But by Lemma 2.7, $(y_n, x_n) = 1$. So, $y_n \mid k$ and hence $y_n \mid t$.
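
The divisibility statements of Lemmas 2.9, 2.11 and 2.12 can be spot-checked by brute force. This is only a sketch over small indices; the helper names `pell_seq` and `check_divisibility` are hypothetical.

```python
def pell_seq(a, count):
    # Lists [x_0..x_{count-1}] and [y_0..y_{count-1}], stepping with Lemma 2.6.
    d = a * a - 1
    xs, ys = [1], [0]
    for _ in range(count - 1):
        x, y = xs[-1], ys[-1]
        xs.append(a * x + d * y)
        ys.append(a * y + x)
    return xs, ys


def check_divisibility(a, limit):
    # Test Lemma 2.9 and Lemma 2.12 for all 1 <= n, t <= limit.
    _, ys = pell_seq(a, limit + 1)
    for n in range(1, limit + 1):
        for t in range(1, limit + 1):
            if (ys[t] % ys[n] == 0) != (t % n == 0):            # Lemma 2.9
                return False
            if ys[t] % ys[n] ** 2 == 0 and t % ys[n] != 0:      # Lemma 2.12
                return False
    return True


print(check_divisibility(2, 40), check_divisibility(3, 40))
```

For Lemma 2.11 with $a = 2$, $n = 2$ one has $y_2 = 4$ and $y_8 = 10864 = 16 \cdot 679$.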


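LEMMA 2.13. $x_{n+1} = 2a\,x_n - x_{n-1}$ and $y_{n+1} = 2a\,y_n - y_{n-1}$.
Proof. By Lemma 2.6,

$$x_{n+1} = a\,x_n + d\,y_n, \qquad y_{n+1} = a\,y_n + x_n,$$
$$x_{n-1} = a\,x_n - d\,y_n, \qquad y_{n-1} = a\,y_n - x_n.$$

So, $x_{n+1} + x_{n-1} = 2a\,x_n$, $y_{n+1} + y_{n-1} = 2a\,y_n$.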


These second order difference equations, together with the initial values $x_0 = 1$,
$x_1 = a$, $y_0 = 0$, $y_1 = 1$, determine the values of all the $x_n$, $y_n$. Various properties of
these sequences are easily established by checking them for $n = 0, 1$ and using these
difference equations to show that the property for $n + 1$ can be inferred from its
holding for $n$ and $n - 1$. Some simple (but important) examples follow:


LEMMA 2.14. $y_n \equiv n \bmod a - 1$.


Proof. For $n = 0, 1$ equality holds. Proceeding inductively, using $a \equiv 1 \bmod a - 1$:

$$y_{n+1} = 2a\,y_n - y_{n-1} \equiv 2n - (n - 1) = n + 1 \bmod a - 1.$$


LEMMA 2.15. If $a \equiv b \bmod c$, then for all $n$,

$$x_n(a) \equiv x_n(b), \qquad y_n(a) \equiv y_n(b) \bmod c.$$

Proof. Again for $n = 0, 1$ the congruence is an equality. Proceeding by induction:

$$y_{n+1}(a) = 2a\,y_n(a) - y_{n-1}(a) \equiv 2b\,y_n(b) - y_{n-1}(b) = y_{n+1}(b) \bmod c.$$


LEMMA 2.16. When $n$ is even $y_n$ is even and when $n$ is odd $y_n$ is odd.


Proof. $y_{n+1} = 2a\,y_n - y_{n-1} \equiv y_{n-1} \bmod 2$. So when $n$ is even, $y_n \equiv y_0 = 0 \bmod 2$,
and when $n$ is odd, $y_n \equiv y_1 = 1 \bmod 2$.
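
The difference equations of Lemma 2.13 give a direct way to tabulate the sequences and to test Lemmas 2.14 to 2.16 on examples. A minimal sketch, with the function name `pell_recurrence` and the sample values $a = 3$, $b = 10$, $c = 7$ chosen only for illustration:

```python
def pell_recurrence(a, count):
    # First `count` values of x_n(a) and y_n(a) from Lemma 2.13 and the
    # initial values x_0 = 1, x_1 = a, y_0 = 0, y_1 = 1.
    xs, ys = [1, a], [0, 1]
    while len(xs) < count:
        xs.append(2 * a * xs[-1] - xs[-2])
        ys.append(2 * a * ys[-1] - ys[-2])
    return xs[:count], ys[:count]


xs3, ys3 = pell_recurrence(3, 12)
xs10, ys10 = pell_recurrence(10, 12)
for n in range(12):
    assert ys3[n] % 2 == n % 2                 # Lemma 2.14 with a - 1 = 2, and Lemma 2.16
    assert (xs3[n] - xs10[n]) % 7 == 0         # Lemma 2.15 with a = 3, b = 10, c = 7
    assert (ys3[n] - ys10[n]) % 7 == 0
print(pell_recurrence(2, 5))
```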


LEMMA 2.17. $x_n(a) - y_n(a)(a - y) \equiv y^n \bmod 2ay - y^2 - 1$.


Proof. $x_0 - y_0(a - y) = 1$ and $x_1 - y_1(a - y) = y$, so the result holds for $n = 0$
and 1. Using Lemma 2.13 and proceeding by induction:

$$x_{n+1} - y_{n+1}(a - y) = 2a[x_n - y_n(a - y)] - [x_{n-1} - y_{n-1}(a - y)]$$
$$\equiv 2a\,y^n - y^{n-1}$$
$$= y^{n-1}(2ay - 1)$$
$$\equiv y^{n-1} y^2$$
$$= y^{n+1}.$$

LEMMA 2.18. For all $n$, $y_{n+1} > y_n \ge n$.


Proof. By Lemma 2.6, $y_{n+1} > y_n$. Since $y_0 = 0 \ge 0$, it follows by induction that
$y_n \ge n$ for all $n$.


LEMMA 2.19. For all $n$, $x_{n+1}(a) > x_n(a) \ge a^n$; $x_n(a) \le (2a)^n$.


Proof. By Lemmas 2.6 and 2.13, $a\,x_n(a) \le x_{n+1}(a) \le (2a)\,x_n(a)$. The result follows
by induction.
Next some periodicity properties of the sequence $x_n$ are obtained.


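LEMMA 2.20. $x_{2n \pm j} \equiv -x_j \bmod x_n$.
Proof. By the addition formulas (Lemma 2.5),

$$x_{2n \pm j} = x_n x_{n \pm j} + d\,y_n y_{n \pm j}$$
$$\equiv d\,y_n (y_n x_j \pm x_n y_j) \bmod x_n$$
$$\equiv d\,y_n^2 x_j \bmod x_n$$
$$= (x_n^2 - 1)\,x_j$$
$$\equiv -x_j \bmod x_n.$$

LEMMA 2.21. $x_{4n \pm j} \equiv x_j \bmod x_n$.
Proof. By Lemma 2.20,

$$x_{4n \pm j} \equiv -x_{2n \pm j} \equiv x_j \bmod x_n.$$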


LEMMA 2.22. Let $x_i \equiv x_j \bmod x_n$, $i \le j \le 2n$, $n > 0$. Then $i = j$, unless $a = 2$,
$n = 1$, $i = 0$ and $j = 2$.


Proof. First suppose $x_n$ is odd and let $q = (x_n - 1)/2$. Then the numbers
$-q, -q+1, -q+2, \dots, -1, 0, 1, \dots, q-1, q$ are a complete set of mutually
incongruent residues modulo $x_n$. Now by Lemma 2.19,

$$1 = x_0 < x_1 < \dots < x_{n-1}.$$

Using Lemma 2.6, $x_{n-1} \le x_n / a \le \tfrac{1}{2} x_n$; so $x_{n-1} \le q$. Also by Lemma 2.20, the numbers

$$x_{n+1}, x_{n+2}, \dots, x_{2n-1}, x_{2n}$$

are congruent modulo $x_n$ respectively to:

$$-x_{n-1}, -x_{n-2}, \dots, -x_1, -x_0 = -1.$$




Thus the numbers $x_0, x_1, x_2, \dots, x_{2n}$ are mutually incongruent modulo $x_n$. This gives
the result.
Next suppose $x_n$ is even and let $q = x_n / 2$. In this case, it is the numbers

$$-q+1, -q+2, \dots, -1, 0, 1, \dots, q-1, q$$

which are a complete set of mutually incongruent residues modulo $x_n$. (For, $-q \equiv q$
$\bmod x_n$.) As above, $x_{n-1} \le q$. So the result will follow as above, unless $x_{n-1} = q$
$= x_n / 2$, so that $x_{n+1} \equiv -q \bmod x_n$, in which case $i = n - 1$, $j = n + 1$ would
contradict our result. But, by Lemma 2.6,

$$x_n = a\,x_{n-1} + d\,y_{n-1},$$

so that $x_n = 2x_{n-1}$ implies $a = 2$ and $y_{n-1} = 0$, i.e., $n = 1$. So the result can fail only
for $a = 2$, $n = 1$ and $i = 0$, $j = 2$.


LEMMA 2.23. Let $x_j \equiv x_i \bmod x_n$, $n > 0$, $0 < i \le n$, $0 \le j < 4n$, then either
$j = i$ or $j = 4n - i$.


Proof. First suppose $j \le 2n$. Then by Lemma 2.22, $j = i$ unless the exceptional
case occurs. Since $i > 0$, this can only happen if $j = 0$. But then

$$i = 2 > 1 = n.$$

Otherwise, let $j > 2n$ and set $\bar{j} = 4n - j$ so $0 < \bar{j} < 2n$. By Lemma 2.21, $x_{\bar{j}} \equiv x_j$
$\equiv x_i \bmod x_n$. Again $\bar{j} = i$ unless the exceptional case of Lemma 2.22 occurs. But this
last is out of the question because $i, \bar{j} > 0$.


LEMMA 2.24. If $0 < i \le n$ and $x_j \equiv x_i \bmod x_n$, then $j \equiv \pm i \bmod 4n$.
Proof. Write $j = 4nq + \bar{j}$, $0 \le \bar{j} < 4n$. By Lemma 2.21,

$$x_i \equiv x_j \equiv x_{\bar{j}} \bmod x_n.$$

By Lemma 2.23, $i = \bar{j}$ or $i = 4n - \bar{j}$. So, $j \equiv \bar{j} \equiv \pm i \bmod 4n$.
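
The periodicity expressed by Lemmas 2.21 and 2.24 can be observed directly on the residues of $x_j$ modulo $x_n$. The sketch below is a brute-force check over a few periods; `pell_x` and `check_lemma_2_24` are hypothetical names.

```python
def pell_x(a, count):
    # First `count` values of x_j(a), from x_{j+1} = 2a*x_j - x_{j-1}.
    xs = [1, a]
    while len(xs) < count:
        xs.append(2 * a * xs[-1] - xs[-2])
    return xs[:count]


def check_lemma_2_24(a, n, periods=3):
    # For 0 < i <= n: whenever x_j = x_i mod x_n, j must be +-i mod 4n.
    xs = pell_x(a, 4 * n * periods + 1)
    for i in range(1, n + 1):
        for j, xj in enumerate(xs):
            if (xj - xs[i]) % xs[n] == 0 and j % (4 * n) not in (i, 4 * n - i):
                return False
    return True


print(all(check_lemma_2_24(a, n) for a in range(2, 5) for n in range(1, 6)))
```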


3. The exponential function. Consider the system of Diophantine equations:

(I)     $x^2 - (a^2 - 1)y^2 = 1$
(II)    $u^2 - (a^2 - 1)v^2 = 1$
(III)   $s^2 - (b^2 - 1)t^2 = 1$
(IV)    $v = ry^2$
(V)     $b = 1 + 4py = a + qu$
(VI)    $s = x + cu$
(VII)   $t = k + 4(d - 1)y$
(VIII)  $y = k + e - 1.$

Then it is possible to prove:


THEOREM 3.1. For given $a, x, k$, $a > 1$, the system I-VIII has a solution in the
remaining arguments $y, u, v, s, t, b, r, p, q, c, d, e$ if and only if $x = x_k(a)$.


Proof. First let there be given a solution of I-VIII. By V, $b > a > 1$. Then I, II,
III imply (by Lemma 2.4) that there are $i, j, n > 0$ such that

$$x = x_i(a),\ y = y_i(a),\ u = x_n(a),\ v = y_n(a),\ s = x_j(b),\ t = y_j(b).$$

By IV, $y \le v$ so that $i \le n$. V and VI yield the congruences

$$b \equiv a \bmod x_n(a); \qquad x_j(b) \equiv x_i(a) \bmod x_n(a)$$

and by Lemma 2.15 one gets also

$$x_j(b) \equiv x_j(a) \bmod x_n(a).$$

Thus,

$$x_i(a) \equiv x_j(a) \bmod x_n(a).$$

By Lemma 2.24,

(1)   $j \equiv \pm i \bmod 4n.$

Next, equation IV yields

$$(y_i(a))^2 \mid y_n(a),$$

so that by Lemma 2.12,

$$y_i(a) \mid n$$

and (1) yields:

(2)   $j \equiv \pm i \bmod 4y_i(a).$

By equation V,

$$b \equiv 1 \bmod 4y_i(a),$$

so by Lemma 2.14,

(3)   $y_j(b) \equiv j \bmod 4y_i(a).$

By equation VII,

(4)   $y_j(b) \equiv k \bmod 4y_i(a).$

Combining (2), (3), (4),

(5)   $k \equiv \pm i \bmod 4y_i(a).$




Equation VIII yields

$$k \le y_i(a)$$

and by Lemma 2.18,

$$i \le y_i(a).$$

Since the numbers

$$-2y+1, -2y+2, \dots, -1, 0, 1, \dots, 2y$$

form a complete set of mutually incongruent residues modulo $4y = 4y_i(a)$, these
inequalities show that (5) implies $k = i$. Hence

$$x = x_i(a) = x_k(a).$$


Conversely, let $x = x_k(a)$. Set $y = y_k(a)$ so that I holds. Let $m = 2k\,y_k(a)$ and let
$u = x_m(a)$, $v = y_m(a)$. Then II is satisfied. By Lemmas 2.9 and 2.11, $y^2 \mid v$. Hence one
can choose $r$ satisfying IV. Moreover by Lemma 2.16, $v$ is even so that $u$ is odd. By
Lemma 2.7, $(u, v) = 1$. Hence $(u, 4y) = 1$. (If $p$ is a prime divisor of $u$ and of $4y$, then
$p \mid y$ because $u$ is odd, and hence $p \mid v$ since $y \mid v$.) So by the Chinese Remainder
Theorem (Theorem 1.2), one can find $b_0$ such that

$$b_0 \equiv 1 \bmod 4y$$
$$b_0 \equiv a \bmod u.$$

Since $b_0 + 4juy$ will also satisfy these congruences, $b, p, q$ satisfying V can be found.
III is satisfied by setting $s = x_k(b)$, $t = y_k(b)$. Since $b > a$, $s = x_k(b) > x_k(a) = x$.
By Lemma 2.15 (using V), $s \equiv x \bmod u$. So $c$ can be chosen to satisfy VI. By Lemma
2.18, $t \ge k$ and by Lemma 2.14, $t \equiv k \bmod b - 1$ and hence using V, $t \equiv k \bmod 4y$.
So $d$ can be chosen to satisfy VII. By Lemma 2.18 again, $y \ge k$, so VIII can be
satisfied by setting $e = y - k + 1$.
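
The converse half of the proof is constructive, and for small $a$ and $k$ the solution it describes can be written down explicitly. The following sketch follows the steps above under the assumption $k \ge 1$; the names `pell`, `witness` and `satisfies_system`, and the particular shift of $b_0$ by $4uy$, are choices made for this illustration.

```python
def pell(a, n):
    # (x_n(a), y_n(a)) by iterating Lemma 2.6.
    d = a * a - 1
    x, y = 1, 0
    for _ in range(n):
        x, y = a * x + d * y, a * y + x
    return x, y


def witness(a, k):
    # Unknowns of system I-VIII for x = x_k(a), built as in the proof.
    x, y = pell(a, k)                          # I
    u, v = pell(a, 2 * k * y)                  # II, with m = 2k*y_k(a)
    r = v // (y * y)                           # IV: y^2 divides v
    four_y = 4 * y
    # Chinese Remainder step: b0 = 1 mod 4y and b0 = a mod u.
    b0 = 1 + four_y * ((a - 1) * pow(four_y, -1, u) % u)
    b = b0 + four_y * u                        # shift so that p, q >= 1
    p, q = (b - 1) // four_y, (b - a) // u     # V
    s, t = pell(b, k)                          # III
    c = (s - x) // u                           # VI
    d = (t - k) // four_y + 1                  # VII
    e = y - k + 1                              # VIII
    return dict(x=x, y=y, u=u, v=v, s=s, t=t, b=b, r=r, p=p, q=q, c=c, d=d, e=e)


def satisfies_system(a, k, w):
    # Check equations I-VIII and that every unknown is a positive integer.
    x, y, u, v, s, t = (w[name] for name in "xyuvst")
    b, r, p, q, c, d, e = (w[name] for name in "brpqcde")
    return all([
        x * x - (a * a - 1) * y * y == 1,
        u * u - (a * a - 1) * v * v == 1,
        s * s - (b * b - 1) * t * t == 1,
        v == r * y * y,
        b == 1 + 4 * p * y == a + q * u,
        s == x + c * u,
        t == k + 4 * (d - 1) * y,
        y == k + e - 1,
        min(w.values()) >= 1,
    ])


print(witness(2, 1))
```

The three-argument `pow` with exponent `-1` (Python 3.8 or later) computes the modular inverse needed for the Chinese Remainder step.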


COROLLARY 3.2. The function

$$g(z, k) = x_k(z + 1)$$

is Diophantine.


Proof. Adjoin to the system I-VIII:

(A)   $a = z + 1.$

By the theorem, the system (A), I-VIII has a solution if and only if $x = x_k(a) = g(z, k)$.
Thus a Diophantine definition of $g$ can be obtained in the usual way by summing
the squares of 9 polynomials.

Now at last it is possible to prove: 


THEOREM 3.3. The exponential function $h(n, k) = n^k$ is Diophantine.


First, a simple inequality:
LEMMA 3.4. If $a > y^k$, then $2ay - y^2 - 1 > y^k$.


Proof. Set $g(y) = 2ay - y^2 - 1$. Then (since $a \ge 2$) $g(1) = 2a - 2 \ge a$. For
$1 \le y < a$, $g'(y) = 2a - 2y > 0$. So $g(y) \ge a$ for $1 \le y < a$. Then for $a > y^k \ge y$,
$2ay - y^2 - 1 \ge a > y^k$.

Now, adjoin to equations I-VIII:

(IX)    $(x - y(a - n) - m)^2 = (f - 1)^2 (2an - n^2 - 1)^2$
(X)     $m + g = 2an - n^2 - 1$
(XI)    $w = n + h = k + l$
(XII)   $a^2 - (w^2 - 1)(w - 1)^2 z^2 = 1.$


Theorem 3.3 then follows at once from: 


LEMMA 3.5. $m = n^k$ if and only if equations I-XII have a solution in the remaining
arguments.


Proof. Suppose I-XII hold. By XI, $w > 1$. Hence $(w - 1)z > 0$ and so by XII
$a > 1$. So Theorem 3.1 applies and it follows that $x = x_k(a)$, $y = y_k(a)$. By IX and
Lemma 2.17,

$$m \equiv n^k \bmod 2an - n^2 - 1.$$

XI yields

$$k, n < w.$$

By XII (using Lemma 2.4), for some $j$, $a = x_j(w)$, $(w - 1)z = y_j(w)$. By Lemma 2.14,

$$j \equiv 0 \bmod w - 1$$

so that $j \ge w - 1$. So by Lemma 2.19,

$$a \ge w^{w-1} > n^k.$$

Now by X, $m < 2an - n^2 - 1$, and by Lemma 3.4,

$$n^k < 2an - n^2 - 1.$$

Since $m$ and $n^k$ are congruent and both less than the modulus, they must be equal.

Conversely, suppose that $m = n^k$. Solutions must be found for I-XII. Choose any
number $w$ such that $w > n$ and $w > k$. Set $a = x_{w-1}(w)$ so that $a > 1$. By Lemma
2.14,

$$y_{w-1}(w) \equiv 0 \bmod w - 1.$$

So one can write

$$y_{w-1}(w) = z(w - 1);$$

thus XII is satisfied. XI can be satisfied by setting

$$h = w - n, \qquad l = w - k.$$

As before, $a > n^k$ so that again by Lemma 3.4,

$$m = n^k < 2an - n^2 - 1$$

and X can be satisfied. Setting $x = x_k(a)$, $y = y_k(a)$, Lemma 2.17 permits one to
define $f$ such that

$$x - y(a - n) - m = \pm (f - 1)(2an - n^2 - 1),$$

so that IX is satisfied. Finally, I-VIII can be satisfied by Theorem 3.1.
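
The core of this argument is that $n^k$ is the least non-negative residue of $x_k(a) - y_k(a)(a - n)$ modulo $2an - n^2 - 1$ once $a$ is large enough. A minimal sketch, using the choice $a = x_{w-1}(w)$ from the proof with the particular value $w = \max(n, k) + 1$ (an assumption of the example, as are the function names):

```python
def pell(a, n):
    # (x_n(a), y_n(a)) by iterating Lemma 2.6.
    d = a * a - 1
    x, y = 1, 0
    for _ in range(n):
        x, y = a * x + d * y, a * y + x
    return x, y


def power_via_pell(n, k):
    # Recover n^k from Pell sequences, following Lemma 2.17 and Lemma 3.4.
    w = max(n, k) + 1                  # any w > n, w > k would do
    a, _ = pell(w, w - 1)              # a = x_{w-1}(w), hence a > n^k
    x, y = pell(a, k)
    modulus = 2 * a * n - n * n - 1    # exceeds n^k by Lemma 3.4
    return (x - y * (a - n)) % modulus


print(power_via_pell(3, 4))
```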


4. The language of Diophantine predicates. Now that it has been proved that the
exponential function is Diophantine, many other functions and sets can be handled.
As an example, let

$$h(u, v, w) = u^{v^w}.$$

The claim is that $h$ is a Diophantine function. For:

$$y = u^{v^w} \Leftrightarrow (\exists z)\,(y = u^z \ \&\ z = v^w),$$

where "&" is the logician's symbol for "and". Using Theorem 3.3, there is a
polynomial $P$ such that:

$$y = u^z \Leftrightarrow (\exists r_1, \dots, r_n)\,[P(y, u, z, r_1, \dots, r_n) = 0],$$
$$z = v^w \Leftrightarrow (\exists s_1, \dots, s_n)\,[P(z, v, w, s_1, \dots, s_n) = 0].$$

Then,

$$y = u^{v^w} \Leftrightarrow (\exists z, r_1, \dots, r_n, s_1, \dots, s_n)\,[P^2(y, u, z, r_1, \dots, r_n) + P^2(z, v, w, s_1, \dots, s_n) = 0].$$


Now this procedure is perfectly general: Expressions which are already known
to yield Diophantine sets may be combined freely using the logical operations of
"&" and "($\exists$)"; the resulting expression will again define a Diophantine set. (Such
expressions are sometimes called Diophantine predicates.) In this "language" it is
also permissible to use the logician's "$\vee$" for "or", since:

$$(\exists r_1, \dots, r_n)\,[P_1 = 0] \vee (\exists s_1, \dots, s_m)\,[P_2 = 0]$$
$$\Leftrightarrow (\exists r_1, \dots, r_n, s_1, \dots, s_m)\,[P_1 P_2 = 0].$$
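
Both combination rules can be seen on a toy example. In the sketch below the two predicates "x is even" and "x is a multiple of 3" and all function names are assumptions made for illustration; solutions are searched by brute force over a small range of positive integers.

```python
def p1(x, r, s):
    # P1 = 0 has a solution in r exactly when x is even.
    return x - 2 * r


def p2(x, r, s):
    # P2 = 0 has a solution in s exactly when x is a multiple of 3.
    return x - 3 * s


def both(x, r, s):
    # "&": sum of squares vanishes only if both polynomials vanish.
    return p1(x, r, s) ** 2 + p2(x, r, s) ** 2


def either(x, r, s):
    # "or": the product vanishes if at least one polynomial vanishes.
    return p1(x, r, s) * p2(x, r, s)


def solvable(poly, x, bound):
    return any(poly(x, r, s) == 0
               for r in range(1, bound + 1) for s in range(1, bound + 1))


print([x for x in range(1, 13) if solvable(both, x, 12)])
print([x for x in range(1, 13) if solvable(either, x, 12)])
```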




Three important Diophantine functions are given by:


THEOREM 4.1. The following functions are Diophantine:

(1)   $f(n, k) = \binom{n}{k}$
(2)   $g(n) = n!$
(3)   $h(a, b, y) = \prod_{k=1}^{y} (a + bk).$


In proving this theorem the familiar notation $[\alpha]$, where $\alpha$ is a real number, will
be used to mean the unique integer such that

$$[\alpha] \le \alpha < [\alpha] + 1.$$
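LEMMA 4.1. For $0 < k \le n$, $u > 2^n$,

$$[(u + 1)^n / u^k] = \sum_{i=k}^{n} \binom{n}{i} u^{i-k}.$$

Proof.

$$(u + 1)^n / u^k = \sum_{i=0}^{n} \binom{n}{i} u^{i-k} = S + R,$$

where

$$S = \sum_{i=k}^{n} \binom{n}{i} u^{i-k}, \qquad R = \sum_{i=0}^{k-1} \binom{n}{i} u^{i-k}.$$

Then $S$ is an integer and

$$R < u^{-1} \sum_{i=0}^{n} \binom{n}{i} = u^{-1} (1 + 1)^n < 1.$$

So,

$$S \le (u + 1)^n / u^k < S + 1$$

which gives the result.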




LEMMA 4.2. For $0 < k \le n$, $u > 2^n$,

$$[(u + 1)^n / u^k] \equiv \binom{n}{k} \bmod u.$$

Proof. In Lemma 4.1 all terms of the sum for which $i > k$ are divisible by $u$.
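
Lemma 4.2 yields a direct way to extract a binomial coefficient from a single integer division. A minimal sketch, taking $u = 2^n + 1$ as one admissible choice (the function name is hypothetical; `math.comb` is used only for comparison):

```python
from math import comb


def binomial_via_floor(n, k):
    # For 0 < k <= n: reduce [(u+1)^n / u^k] modulo u, with u > 2^n.
    u = 2 ** n + 1
    return ((u + 1) ** n // u ** k) % u


print(binomial_via_floor(4, 2), comb(4, 2))
```

Since $\binom{n}{k} \le 2^n < u$, the residue is the binomial coefficient itself.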


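LEMMA 4.3. $f(n, k) = \binom{n}{k}$ is Diophantine.


Proof. Since

$$\binom{n}{k} \le \sum_{i=0}^{n} \binom{n}{i} = 2^n < u,$$

Lemma 4.2 determines $\binom{n}{k}$ as the unique positive integer congruent to
$[(u + 1)^n / u^k]$ modulo $u$ and $< u$. Thus,

$$z = \binom{n}{k} \Leftrightarrow (\exists u, v, w)\,(v = 2^n \ \&\ u > v \ \&\ w = [(u + 1)^n / u^k] \ \&\ z \equiv w \bmod u \ \&\ z < u).$$

To see that $\binom{n}{k}$ is Diophantine, it then suffices to note that each of the above
expressions separated by "&" is a Diophantine predicate; $v = 2^n$ is of course Diophantine
by Theorem 3.3. The inequality $u > v$ is of course Diophantine since $u > v \Leftrightarrow$
$(\exists x)(u = v + x)$. Also,

$$z \equiv w \bmod u \ \&\ z < u \Leftrightarrow (\exists x, y)\,(w = z + (x - 1)u \ \&\ u = z + y).$$

Finally

$$w = [(u + 1)^n / u^k] \Leftrightarrow (\exists x, y, t)\,(t = u + 1 \ \&\ x = t^n \ \&\ y = u^k \ \&\ w \le x/y < w + 1),$$

and $w \le x/y < w + 1 \Leftrightarrow wy \le x < (w + 1)y$.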


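LEMMA 4.4. If $r > (2x)^{x+1}$ then

$$x! = \left[ r^x \Big/ \binom{r}{x} \right].$$


Proof. Let $r > (2x)^{x+1}$. Then,

$$r^x \Big/ \binom{r}{x} = \frac{r^x x!}{r(r-1)\cdots(r-x+1)} = x! \cdot \frac{1}{(1 - \frac{1}{r}) \cdots (1 - \frac{x-1}{r})} < x! \cdot \frac{1}{(1 - \frac{x}{r})^x}.$$

Now,

$$\frac{1}{1 - \frac{x}{r}} = 1 + \frac{x}{r} + \left(\frac{x}{r}\right)^2 + \cdots = 1 + \frac{x}{r}\left\{1 + \frac{x}{r} + \left(\frac{x}{r}\right)^2 + \cdots\right\} < 1 + \frac{x}{r}\left\{1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots\right\} = 1 + \frac{2x}{r}.$$

And,

$$\left(1 + \frac{2x}{r}\right)^x = \sum_{j=0}^{x} \binom{x}{j} \left(\frac{2x}{r}\right)^j < 1 + \frac{2x}{r} \sum_{j=1}^{x} \binom{x}{j} < 1 + \frac{2x}{r} \cdot 2^x.$$

So,

$$r^x \Big/ \binom{r}{x} < x! + \frac{2x}{r}\, x!\, 2^x \le x! + \frac{2^{x+1} x^{x+1}}{r} < x! + 1.$$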


LEMMA 4.5. $n!$ is a Diophantine function.
Proof. $m = n! \Leftrightarrow$

$$(\exists r, s, t, u, v)\,\{s = 2n + 1 \ \&\ t = n + 1 \ \&\ r = s^t \ \&\ u = r^n \ \&\ v = \binom{r}{n} \ \&\ mv \le u < (m + 1)v\}.$$
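
Lemma 4.4 can be checked with exact integer arithmetic. The sketch below takes $r = (2x + 1)^{x+1}$, which satisfies the hypothesis $r > (2x)^{x+1}$; the function name is an assumption of the example and `math.factorial` serves only as a reference value.

```python
from math import comb, factorial


def factorial_via_binomial(x):
    # x! = [r^x / C(r, x)] for r > (2x)^(x+1); here r = (2x+1)^(x+1).
    r = (2 * x + 1) ** (x + 1)
    return r ** x // comb(r, x)


print([factorial_via_binomial(x) for x in range(1, 7)])
```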
LEMMA 4.6. Let $bq \equiv a \bmod M$. Then,

$$\prod_{k=1}^{y} (a + bk) \equiv b^y\, y! \binom{q + y}{y} \bmod M.$$
k= 




Proof.

$$b^y\, y! \binom{q + y}{y} = b^y (q + y)(q + y - 1) \cdots (q + 1)$$
$$= (bq + yb)(bq + (y - 1)b) \cdots (bq + b)$$
$$\equiv (a + yb)(a + (y - 1)b) \cdots (a + b) \pmod{M}.$$
LEMMA 4.7. $h(a, b, y) = \prod_{k=1}^{y} (a + bk)$ is a Diophantine function.


Proof. In Lemma 4.6 choose $M = b(a + by)^y + 1$. Then, $(M, b) = 1$ and
$M > \prod_{k=1}^{y} (a + bk)$. Hence the congruence $bq \equiv a \bmod M$ is solvable for $q$ and then
$\prod_{k=1}^{y} (a + bk)$ is determined as the unique number which is congruent modulo $M$
to $b^y\, y! \binom{q + y}{y}$ and is also $< M$. I.e.,

$$z = \prod_{k=1}^{y} (a + bk) \Leftrightarrow (\exists M, p, q, r, s, t, u, v, w, x)$$
$$[r = a + by \ \&\ s = r^y \ \&\ M = bs + 1 \ \&\ bq = a + Mt \ \&\ u = b^y \ \&\ v = y! \ \&\ z < M$$
$$\&\ w = q + y \ \&\ x = \binom{w}{y} \ \&\ z + Mp = uvx].$$

Using the previous expressions for the exponential function, for $v = y!$ and for
$x = \binom{w}{y}$, we obtain the result.
The assertion of Theorem 4.1 is contained in Lemmas 4.3, 4.5, and 4.7. 
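
The recipe of Lemmas 4.6 and 4.7 can be run on small inputs. In this sketch the modular inverse `pow(b, -1, M)` (Python 3.8 or later) plays the role of solving $bq \equiv a \bmod M$; the function name is hypothetical and `math.prod` supplies the reference value.

```python
from math import comb, factorial, prod


def product_via_binomial(a, b, y):
    # prod_{k=1..y} (a + b*k) as the residue of b^y * y! * C(q+y, y) mod M.
    M = b * (a + b * y) ** y + 1          # coprime to b and larger than the product
    q = (a * pow(b, -1, M)) % M           # solves b*q = a mod M
    return (b ** y * factorial(y) * comb(q + y, y)) % M


print(product_via_binomial(1, 1, 3), prod(1 + k for k in range(1, 4)))
```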


5. Bounded quantifiers. The language of Diophantine predicates permits use of
&, $\vee$, and $\exists$. Other operations used by logicians are:

~ for "not"
($\forall x$) for "for all x"
$\rightarrow$ for "if ..., then ..."

However, as will be clear later, the use of any of these other operations can lead to
expressions which define sets that are not Diophantine. There are also the bounded
existential quantifiers:

"$(\exists y)_{\le x} \dots$" which means "$(\exists y)(y \le x \ \&\ \dots)$"

and the bounded universal quantifiers:

"$(\forall y)_{\le x} \dots$" which means "$(\forall y)(y > x \vee \dots)$".




It turns out that these operations may be adjoined to the language of Diophantine
predicates; that is, the sets defined by expressions of this extended language will
still be Diophantine. I.e.,


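THEOREM 5.1. If $P$ is a polynomial,

$$R = \{\langle y, x_1, \dots, x_n \rangle \mid (\exists z)_{\le y} (\exists y_1, \dots, y_m)\,[P(y, z, x_1, \dots, x_n, y_1, \dots, y_m) = 0]\}$$

and

$$S = \{\langle y, x_1, \dots, x_n \rangle \mid (\forall z)_{\le y} (\exists y_1, \dots, y_m)\,[P(y, z, x_1, \dots, x_n, y_1, \dots, y_m) = 0]\},$$

then $R$ and $S$ are Diophantine.
That $R$ is Diophantine is trivial. Namely,

$$\langle y, x_1, \dots, x_n \rangle \in R \Leftrightarrow (\exists z, y_1, \dots, y_m)\,(z \le y \ \&\ P = 0).$$

The proof of the other half of the theorem is far more complicated.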
LEMMA 5.1.

$$(\forall k)_{\le y} (\exists y_1, \dots, y_m)\,[P(y, k, x_1, \dots, x_n, y_1, \dots, y_m) = 0]$$
$$\Leftrightarrow$$
$$(\exists u) (\forall k)_{\le y} (\exists y_1, \dots, y_m)_{\le u}\,[P(y, k, x_1, \dots, x_n, y_1, \dots, y_m) = 0].$$


Proof. The right side of the equivalence trivially implies the left side. For the
converse, suppose the left side is true for given $y, x_1, \dots, x_n$. Then for each $k = 1, 2, \dots, y$
there are definite numbers $y_1^{(k)}, \dots, y_m^{(k)}$ for which:

$$P(y, k, x_1, \dots, x_n, y_1^{(k)}, \dots, y_m^{(k)}) = 0.$$

Taking $u$ to be the maximum of the $my$ numbers

$$\{y_j^{(k)} \mid j = 1, \dots, m;\ k = 1, 2, \dots, y\},$$

it follows that the right side of the equivalence is likewise true.


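LEMMA 5.2. Let $Q(y, u, x_1, \dots, x_n)$ be a polynomial with the properties:

(1) $Q(y, u, x_1, \dots, x_n) > u$,
(2) $Q(y, u, x_1, \dots, x_n) > y$,
(3) $k \le y$ and $y_1, \dots, y_m \le u$ imply $|P(y, k, x_1, \dots, x_n, y_1, \dots, y_m)| \le Q(y, u, x_1, \dots, x_n)$.

Then,

$$(\forall k)_{\le y} (\exists y_1, \dots, y_m)_{\le u}\,[P(y, k, x_1, \dots, x_n, y_1, \dots, y_m) = 0]$$
$$\Leftrightarrow$$
$$(\exists c, t, a_1, \dots, a_m)\,\Big[1 + ct = \prod_{k=1}^{y} (1 + kt) \ \&\ t = Q(y, u, x_1, \dots, x_n)!$$
$$\&\ 1 + ct \,\Big|\, \prod_{j=1}^{u} (a_1 - j) \ \&\ \cdots \ \&\ 1 + ct \,\Big|\, \prod_{j=1}^{u} (a_m - j)$$
$$\&\ P(y, c, x_1, \dots, x_n, a_1, \dots, a_m) \equiv 0 \bmod 1 + ct\Big].$$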


The point of this lemma is that while the right side of the equivalence seems the 
more complicated of the two, it is free of bounded universal quantifiers. 


Proof. First the implication in the $\Leftarrow$ direction:

For each $k = 1, 2, \dots, y$, let $p_k$ be a prime factor of $1 + kt$. Let $y_i^{(k)}$ be the remainder
when $a_i$ is divided by $p_k$ ($k = 1, 2, \dots, y$; $i = 1, 2, \dots, m$). It will follow that for each
$k$, $i$:

(a) $1 \le y_i^{(k)} \le u$
(b) $P(y, k, x_1, \dots, x_n, y_1^{(k)}, \dots, y_m^{(k)}) = 0$.

To demonstrate (a), note that $p_k \mid 1 + kt$, $1 + kt \mid 1 + ct$ and $1 + ct \mid \prod_{j=1}^{u} (a_i - j)$. I.e.,
$p_k \mid \prod_{j=1}^{u} (a_i - j)$. Since $p_k$ is a prime, $p_k \mid a_i - j$ for some $j = 1, 2, \dots, u$. That is

$$j \equiv a_i \equiv y_i^{(k)} \bmod p_k.$$

Since $t = Q(y, u, x_1, \dots, x_n)!$, (2) implies that every divisor of $1 + kt$ must be
$> Q(y, u, x_1, \dots, x_n)$. So $p_k > Q(y, u, x_1, \dots, x_n)$ and by (1), $p_k > u$. Hence $j \le u < p_k$.
Since $y_i^{(k)}$ is the remainder when $a_i$ is divided by $p_k$, also $y_i^{(k)} < p_k$. So,

$$y_i^{(k)} = j.$$

To demonstrate (b), first note that

$$1 + ct \equiv 1 + kt \equiv 0 \bmod p_k.$$

Hence

$$k + kct \equiv c + kct \bmod p_k,$$

i.e., $k \equiv c \bmod p_k$. We have already obtained

$$y_i^{(k)} \equiv a_i \bmod p_k.$$

Thus,

$$P(y, k, x_1, \dots, x_n, y_1^{(k)}, \dots, y_m^{(k)}) \equiv P(y, c, x_1, \dots, x_n, a_1, \dots, a_m) \equiv 0 \bmod p_k.$$

Finally

$$|P(y, k, x_1, \dots, x_n, y_1^{(k)}, \dots, y_m^{(k)})| \le Q(y, u, x_1, \dots, x_n) < p_k.$$

This proves (b) and completes the proof of the $\Leftarrow$ implication.
To prove the $\Rightarrow$ implication, let

$$P(y, k, x_1, \dots, x_n, y_1^{(k)}, \dots, y_m^{(k)}) = 0,$$

for each $k = 1, 2, \dots, y$, where each $y_j^{(k)} \le u$. We set $t = Q(y, u, x_1, \dots, x_n)!$, and since
$\prod_{k=1}^{y} (1 + kt) \equiv 1 \bmod t$, we can find $c$ such that

$$1 + ct = \prod_{k=1}^{y} (1 + kt).$$

Now, it is claimed that for $1 \le k < l \le y$,

$$(1 + kt, 1 + lt) = 1.$$

For, let $p \mid 1 + kt$, $p \mid 1 + lt$. Then $p \mid l - k$, so $p < y$. But since $Q(y, u, x_1, \dots, x_n) > y$
this implies $p \mid t$ which is impossible. Thus the numbers $1 + kt$ form an admissible
sequence of moduli and the Chinese Remainder Theorem (Theorem 1.2) may be
applied to yield, for each $i$, $1 \le i \le m$, a number $a_i$ such that

$$a_i \equiv y_i^{(k)} \bmod 1 + kt, \qquad k = 1, 2, \dots, y.$$

As above, $k \equiv c \bmod 1 + kt$. So

$$P(y, c, x_1, \dots, x_n, a_1, \dots, a_m) \equiv P(y, k, x_1, \dots, x_n, y_1^{(k)}, \dots, y_m^{(k)}) = 0 \bmod 1 + kt.$$

Since the numbers $1 + kt$ are relatively prime in pairs and each divides
$P(y, c, x_1, \dots, x_n, a_1, \dots, a_m)$, so does their product. I.e.,

$$P(y, c, x_1, \dots, x_n, a_1, \dots, a_m) \equiv 0 \bmod 1 + ct.$$

Finally,

$$a_i \equiv y_i^{(k)} \bmod 1 + kt,$$

i.e.,

$$1 + kt \mid a_i - y_i^{(k)}.$$

Since $1 \le y_i^{(k)} \le u$,

$$1 + kt \,\Big|\, \prod_{j=1}^{u} (a_i - j).$$

And again since the $1 + kt$'s are relatively prime to one another,

$$1 + ct \,\Big|\, \prod_{j=1}^{u} (a_i - j).$$




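Now it is easy to complete the proof of Theorem 5.1 using Lemmas 5.1 and 5.2.
First find a polynomial $Q$ satisfying (1), (2), (3) of Lemma 5.2. This is easy to do:
Write

$$P(y, k, x_1, \dots, x_n, y_1, \dots, y_m) = \sum_{r=1}^{N} t_r,$$

where each $t_r$ has the form

$$t_r = c\, y^a k^b x_1^{q_1} x_2^{q_2} \cdots x_n^{q_n} y_1^{s_1} y_2^{s_2} \cdots y_m^{s_m}$$

for $c$ an integer positive or negative. Set $u_r = |c|\, y^{a+b} x_1^{q_1} x_2^{q_2} \cdots x_n^{q_n} u^{s_1 + s_2 + \cdots + s_m}$ and let

$$Q(y, u, x_1, \dots, x_n) = u + y + \sum_{r=1}^{N} u_r.$$

Then (1), (2), and (3) of Lemma 5.2 hold trivially. Thus:

$$(\forall k)_{\le y} (\exists y_1, \dots, y_m)\,[P(y, k, x_1, \dots, x_n, y_1, \dots, y_m) = 0]$$
$$\Leftrightarrow$$
$$(\exists u, c, t, a_1, \dots, a_m)\,\Big[1 + ct = \prod_{k=1}^{y} (1 + kt) \ \&\ t = Q(y, u, x_1, \dots, x_n)!$$
$$\&\ 1 + ct \,\Big|\, \prod_{j=1}^{u} (a_1 - j) \ \&\ \cdots \ \&\ 1 + ct \,\Big|\, \prod_{j=1}^{u} (a_m - j)$$
$$\&\ P(y, c, x_1, \dots, x_n, a_1, \dots, a_m) \equiv 0 \bmod 1 + ct\Big]$$
$$\Leftrightarrow$$
$$(\exists u, c, t, a_1, \dots, a_m, e, f, g_1, \dots, g_m, h_1, \dots, h_m, l)$$
$$\Big[e = 1 + ct \ \&\ e = \prod_{k=1}^{y} (1 + kt) \ \&\ f = Q(y, u, x_1, \dots, x_n) \ \&\ t = f!$$
$$\&\ g_1 = a_1 - u - 1 \ \&\ g_2 = a_2 - u - 1 \ \&\ \cdots \ \&\ g_m = a_m - u - 1$$
$$\&\ h_1 = \prod_{k=1}^{u} (g_1 + k) \ \&\ h_2 = \prod_{k=1}^{u} (g_2 + k) \ \&\ \cdots \ \&\ h_m = \prod_{k=1}^{u} (g_m + k)$$
$$\&\ e \mid h_1 \ \&\ e \mid h_2 \ \&\ \cdots \ \&\ e \mid h_m \ \&\ l = P(y, c, x_1, \dots, x_n, a_1, \dots, a_m) \ \&\ e \mid l\Big]$$

and this is Diophantine by Theorem 4.1.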




6. Recursive functions. So far one trick after another has been used to show 
that various sets are Diophantine. But now very powerful methods are available: it 
turns out that the expanded version of the language of Diophantine predicates, 
permitting the use of bounded quantifiers (sanctioned by Theorem 5.1) together with 
the Sequence Number Theorem (Theorem 1.3) enables one to show in quite a straight- 
forward way that almost any set we please is Diophantine. 

Some examples are in order: 

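(i) the set $P$ of prime numbers:

$$x \in P \Leftrightarrow x > 1 \ \&\ (\forall y, z)_{\le x}\,[yz < x \vee yz > x \vee y = 1 \vee z = 1].$$

Another Diophantine definition of the primes is:

$$x \in P \Leftrightarrow x > 1 \ \&\ ((x - 1)!, x) = 1$$
$$\Leftrightarrow x > 1 \ \&\ (\exists y, z, u, v)\,[y = x - 1 \ \&\ z = y! \ \&\ (uz - vx)^2 = 1];$$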


but the first definition is the more natural one. 
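
Both definitions translate directly into executable predicates, which makes it easy to confirm that they agree on small numbers. The function names below are assumptions of this sketch.

```python
from math import factorial, gcd


def prime_by_bounded_quantifier(x):
    # x > 1 and, for all y, z <= x: yz != x or y = 1 or z = 1.
    return x > 1 and all(y * z != x or y == 1 or z == 1
                         for y in range(1, x + 1) for z in range(1, x + 1))


def prime_by_factorial(x):
    # x > 1 and gcd((x - 1)!, x) = 1.
    return x > 1 and gcd(factorial(x - 1), x) == 1


print([x for x in range(1, 31) if prime_by_factorial(x)])
```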

From Theorem 1.4 it follows that there is a ‘“‘prime-representing’’ polynomial 
P, i.e., a positive integer is prime if and only if it is in the range of P. For an 
explicit construction of such a polynomial P, cf. [23a]. 

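(ii) the function $g(y) = \prod_{k=1}^{y} (1 + k^2)$. Here we use the Sequence Number
Theorem to "encode" the sequence $g(1), g(2), \dots, g(y)$ into a single number $u$, i.e.,
so that

$$S(i, u) = g(i), \qquad i = 1, 2, \dots, y.$$

Thus, $z = g(y)$

$$\Leftrightarrow (\exists u)\,\{S(1, u) = 2 \ \&\ (\forall k)_{\le y}\,[k = 1 \vee (S(k, u) = (1 + k^2) S(k - 1, u))] \ \&\ z = S(y, u)\}$$
$$\Leftrightarrow (\exists u)\,\{S(1, u) = 2 \ \&\ (\forall k)_{\le y}\,[k = 1 \vee (\exists a, b, c)\,(a = k - 1$$
$$\&\ b = S(a, u) \ \&\ c = S(k, u) \ \&\ c = (1 + k^2) b)] \ \&\ z = S(y, u)\}.$$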


By now it is clear that the available methods are quite general. They are so
powerful that the question becomes: how can any "reasonable" set or function
escape these methods, i.e., not be Diophantine?

The strength of the methods can be tested by considering the class of all computable
or recursive functions. These are the functions which can be computed by a
finite program or computing machine having arbitrarily large amounts of time and
memory at its disposal. Many rigorous definitions of this class (all of them equivalent)
are available. One of the simplest is as follows:

The recursive functions® are all those functions obtainable from the initial 
functions 


$$c(x) = 1, \quad s(x) = x + 1, \quad U_i^n(x_1, \dots, x_n) = x_i,\ 1 \le i \le n;$$
$$S(i, u) \text{ (the sequence number function)}*$$

by iteratively applying the three operations: composition, primitive recursion, and
minimalization defined below:


COMPOSITION yields the function

$$h(x_1, \dots, x_n) = f(g_1(x_1, \dots, x_n), \dots, g_m(x_1, \dots, x_n))$$

from the given functions $g_1, \dots, g_m$ and $f(t_1, \dots, t_m)$.


PRIMITIVE RECURSION yields the function $h(x_1, \dots, x_n, z)$ which satisfies the
equations:

$$h(x_1, \dots, x_n, 1) = f(x_1, \dots, x_n)$$
$$h(x_1, \dots, x_n, t + 1) = g(t, h(x_1, \dots, x_n, t), x_1, \dots, x_n),$$

from the given functions $f$, $g$.
When $n = 0$, $f$ becomes a constant so that $h$ is obtained directly from $g$.


MINIMALIZATION yields the function:

$$h(x_1, \dots, x_n) = \min_y\,[f(x_1, \dots, x_n, y) = g(x_1, \dots, x_n, y)]$$

from the given functions $f$, $g$ assuming that $f$, $g$ are such that for each $x_1, \dots, x_n$ there
is at least one $y$ satisfying the equation $f(x_1, \dots, x_n, y) = g(x_1, \dots, x_n, y)$ (i.e., $h$ must
be everywhere defined).

The main result of this article is: 


THEOREM 6.1. A function is Diophantine if and only if it is recursive. 


To begin with, consider the following short list of recursive functions:
(1) $x + y$ is recursive since

$$x + 1 = s(x),$$
$$x + (t + 1) = s(x + t) = g(t, x + t, x),$$

where $g(u, v, w) = s(U_2^3(u, v, w))$.
(2) $x \cdot y$ is recursive since

$$x \cdot 1 = U_1^1(x)$$
$$x \cdot (t + 1) = (x \cdot t) + x = g(t, x \cdot t, x),$$

where $g(u, v, w) = U_2^3(u, v, w) + U_3^3(u, v, w)$.
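
The three operations can be modelled as higher-order functions, and the definitions of addition and multiplication above then become executable. This is only a sketch over positive integers: the names `U`, `primitive_recursion` and `minimalization` are chosen for the illustration, and the sample function `root` is defined only at perfect squares, so it illustrates the search but is not itself everywhere defined.

```python
def s(x):
    # Successor, one of the initial functions.
    return x + 1


def U(i, n):
    # Projection U_i^n; n is kept only to mirror the notation.
    return lambda *xs: xs[i - 1]


def primitive_recursion(f, g):
    # h(xs, 1) = f(xs); h(xs, t + 1) = g(t, h(xs, t), xs).
    def h(*args):
        *xs, z = args
        value = f(*xs)
        for t in range(1, z):
            value = g(t, value, *xs)
        return value
    return h


def minimalization(f, g):
    # h(xs) = least y >= 1 with f(xs, y) = g(xs, y); loops forever if none exists.
    def h(*xs):
        y = 1
        while f(*xs, y) != g(*xs, y):
            y += 1
        return y
    return h


add = primitive_recursion(s, lambda u, v, w: s(U(2, 3)(u, v, w)))
mul = primitive_recursion(U(1, 1),
                          lambda u, v, w: add(U(2, 3)(u, v, w), U(3, 3)(u, v, w)))
root = minimalization(lambda x, y: mul(y, y), lambda x, y: x)

print(add(3, 4), mul(3, 4), root(49))
```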




(3) For each fixed $k$, the constant function $c_k(x) = k$ is recursive, since $c_1(x)$ is
one of the initial functions and $c_{k+1}(x) = c_k(x) + c(x)$.

(4) Any polynomial $P(x_1, \dots, x_n)$ with positive integer coefficients is recursive,
since any such function can be expressed by a finite iteration of additions and
multiplications of variables and $c_k(x)$. E.g.,

$$2x^2 y + 3xz^3 + 5 = c_2(x) \cdot x \cdot x \cdot y + c_3(x) \cdot x \cdot z \cdot z \cdot z + c_5(x).$$

So (1), (2), (3), and composition give the result.
Now it is easy to see that every Diophantine function is recursive:
Let $f$ be Diophantine, and write:

$$y = f(x_1, \dots, x_n) \Leftrightarrow (\exists t_1, \dots, t_m)\,[P(x_1, \dots, x_n, y, t_1, \dots, t_m) = Q(x_1, \dots, x_n, y, t_1, \dots, t_m)],$$

where $P$, $Q$ are polynomials with positive integer coefficients. Then, by the sequence
number theorem:

$$f(x_1, \dots, x_n) = S(1, \min_u\,[P(x_1, \dots, x_n, S(1, u), S(2, u), \dots, S(m + 1, u)) = Q(x_1, \dots, x_n, S(1, u), S(2, u), \dots, S(m + 1, u))]).$$

Since $P$, $Q$, $S(i, u)$ are recursive, so is $f$ (using composition and minimalization).

To obtain the converse: $S(i, u)$ is known to be Diophantine; the other initial
functions are trivially Diophantine. Hence it suffices to prove that the Diophantine
functions are closed under composition, primitive recursion and minimalization.


Composition: If $h(x_1, \dots, x_n) = f(g_1(x_1, \dots, x_n), \dots, g_m(x_1, \dots, x_n))$, where $f, g_1,$
$\dots, g_m$ are Diophantine, then so is $h$ since

$$y = h(x_1, \dots, x_n) \Leftrightarrow (\exists t_1, \dots, t_m)\,[t_1 = g_1(x_1, \dots, x_n) \ \&\ \cdots \ \&\ t_m = g_m(x_1, \dots, x_n) \ \&\ y = f(t_1, \dots, t_m)].$$

Primitive Recursion: If

$$h(x_1, \dots, x_n, 1) = f(x_1, \dots, x_n)$$
$$h(x_1, \dots, x_n, t + 1) = g(t, h(x_1, \dots, x_n, t), x_1, \dots, x_n),$$

and $f$, $g$ are Diophantine, then (using the sequence number theorem to "code" the
numbers $h(x_1, \dots, x_n, 1), h(x_1, \dots, x_n, 2), \dots, h(x_1, \dots, x_n, z)$):

$$y = h(x_1, \dots, x_n, z) \Leftrightarrow$$
$$(\exists u)\,\{(\exists v)\,[v = S(1, u) \ \&\ v = f(x_1, \dots, x_n)]$$
$$\&\ (\forall t)_{\le z}\,[(t = z) \vee (\exists v)\,(v = S(t + 1, u) \ \&\ v = g(t, S(t, u), x_1, \dots, x_n))] \ \&\ y = S(z, u)\}$$



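so that (using Theorem 5.1) $h$ is Diophantine.
Minimalization: If

$$h(x_1, \dots, x_n) = \min_y\,[f(x_1, \dots, x_n, y) = g(x_1, \dots, x_n, y)],$$

where $f$, $g$ are Diophantine, then so is $h$ since,

$$y = h(x_1, \dots, x_n) \Leftrightarrow$$
$$(\exists z)\,[z = f(x_1, \dots, x_n, y) \ \&\ z = g(x_1, \dots, x_n, y)]$$
$$\&\ (\forall t)_{\le y}\,[(t = y) \vee (\exists u, v)\,(u = f(x_1, \dots, x_n, t) \ \&\ v = g(x_1, \dots, x_n, t) \ \&\ (u < v \vee v < u))].$$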


7. A universal Diophantine set. An explicit enumeration of all the Diophantine
sets of positive integers will now be described. Any polynomial with positive integer
coefficients can be built up from 1 and variables by successive additions and
multiplications. We fix the alphabet

$$x_0, x_1, x_2, x_3, \dots$$

of variables and then set up the following enumeration of all such polynomials
(using the pairing functions):

$$P_1 = 1$$
$$P_{3i-1} = x_{i-1}$$
$$P_{3i} = P_{L(i)} + P_{R(i)}$$
$$P_{3i+1} = P_{L(i)} \cdot P_{R(i)}.$$


Write $P_i = P_i(x_0, x_1, \dots, x_n)$, where $n$ is large enough so that all variables occurring
in $P_i$ are included. (Of course $P_i$ will not in general depend on all of these variables.)
Finally, let

$$D_n = \{x_0 \mid (\exists x_1, \dots, x_n)\,[P_{L(n)}(x_0, x_1, \dots, x_n) = P_{R(n)}(x_0, x_1, \dots, x_n)]\}.$$

Here, $P_{L(n)}$ and $P_{R(n)}$ do not actually involve all of the variables $x_0, x_1, \dots, x_n$, but
clearly cannot involve any others. (Recall that $L(n), R(n) \le n$.) By the way the
sequence $P_i$ has been constructed, it is seen that the sequence of sets:

$$D_1, D_2, D_3, D_4, \dots$$

includes all Diophantine sets. Moreover:
THEOREM 7.1 (Universality Theorem).

$$\{\langle n, x \rangle \mid x \in D_n\} \text{ is Diophantine.}$$




Proof. Once again using the sequence number theorem, it is claimed that:

$$x \in D_n \Leftrightarrow (\exists u)\,\{S(1, u) = 1 \ \&\ S(2, u) = x$$
$$\&\ (\forall i)_{\le n}\,[S(3i, u) = S(L(i), u) + S(R(i), u)]$$
$$\&\ (\forall i)_{\le n}\,[S(3i + 1, u) = S(L(i), u) \cdot S(R(i), u)]$$
$$\&\ S(L(n), u) = S(R(n), u)\}.$$

It is clear enough that the predicate on the right-hand side of this equivalence is
Diophantine, so it is only necessary to verify the claim:

Let $x \in D_n$ for given $x$, $n$. Then there are numbers $t_1, \dots, t_n$ such that
$P_{L(n)}(x, t_1, \dots, t_n) = P_{R(n)}(x, t_1, \dots, t_n)$. Choose $u$ (by the sequence number theorem)
so that

(*)   $S(j, u) = P_j(x, t_1, \dots, t_n), \qquad j = 1, 2, \dots, 3n + 2.$

Then in particular $S(2, u) = x$ and $S(3i - 1, u) = t_{i-1}$, $i = 2, 3, \dots, n + 1$. Thus the
right-hand side of the equivalence is true.
Conversely, let the right-hand side hold for given $n$, $x$. Set

$$t_1 = S(5, u),\ t_2 = S(8, u),\ \dots,\ t_n = S(3n + 2, u).$$

Then, (*) must be true. Since $S(L(n), u) = S(R(n), u)$, it must be the case that

$$P_{L(n)}(x, t_1, \dots, t_n) = P_{R(n)}(x, t_1, \dots, t_n),$$

so that $x \in D_n$.
Since $D_1, D_2, D_3, \dots$ gives an enumeration of all Diophantine sets, it is easy to
construct a set different from all of them and hence non-Diophantine. That is, define:

$$V = \{n \mid n \notin D_n\}.$$

THEOREM 7.2. $V$ is not Diophantine.


Proof. This is a simple application of Cantor's diagonal method. If $V$ were
Diophantine, then for some fixed $i$, $V = D_i$. Does $i \in V$? We have:

$$i \in V \Leftrightarrow i \in D_i; \qquad i \in V \Leftrightarrow i \notin D_i.$$

This is a contradiction.

THEOREM 7.3. The function $g(n, x)$ defined by:

$$g(n, x) = 1 \text{ if } x \notin D_n,$$
$$g(n, x) = 2 \text{ if } x \in D_n,$$

is not recursive.


Proof. If $g$ were recursive then it would be Diophantine (Theorem 6.1), say:

$$y = g(n, x) \Leftrightarrow (\exists y_1, \dots, y_m)\,[P(n, x, y, y_1, \dots, y_m) = 0].$$

But then, it would follow that

$$V = \{x \mid (\exists y_1, \dots, y_m)\,[P(x, x, 1, y_1, \dots, y_m) = 0]\}$$

which contradicts Theorem 7.2.
Using Theorem 7.1, write:

$$x \in D_n \Leftrightarrow (\exists z_1, \dots, z_k)\,[P(n, x, z_1, \dots, z_k) = 0],$$

where $P$ is some definite (though complicated) polynomial. Suppose there were an
algorithm for testing Diophantine equations for possession of positive integer
solutions; i.e., an algorithm for Hilbert's tenth problem! Then for given $n$, $x$ this
algorithm could be used to test whether or not the equation

$$P(n, x, z_1, \dots, z_k) = 0$$

has a solution, i.e., whether or not $x \in D_n$. Thus the algorithm could be used to
compute the function $g(n, x)$. Since the recursive functions are just those for which a
computing algorithm exists, $g$ would have to be recursive. This would contradict
Theorem 7.3, and this contradiction proves:


THEOREM 7.4. Hilbert’s tenth problem is unsolvable! 


Naturally this result gives no information about the existence of solutions for 
any specific Diophantine equation; it merely guarantees that there is no single 
algorithm for testing the class of all Diophantine equations. Also note that: 


$$x \in V \Leftrightarrow \sim(\exists z_1, \dots, z_k)\,[P(x, x, z_1, \dots, z_k) = 0]$$
$$\Leftrightarrow \{(\exists z_1, \dots, z_k)\,[P(x, x, z_1, \dots, z_k) = 0] \rightarrow 1 = 0\}$$
$$\Leftrightarrow (\forall z_1, \dots, z_k)\,[P(x, x, z_1, \dots, z_k) > 0 \vee P(x, x, z_1, \dots, z_k) < 0]$$

which shows that if either ~ or unbounded universal quantifiers ($\forall z$) or implication
($\rightarrow$) are permitted in the language of Diophantine predicates, then non-Diophantine
sets will be produced.

It is natural to associate with each Diophantine set a dimension and a degree;
i.e., the dimension of $S$ is the least $n$ for which a polynomial $P$ exists for which:

(*)   $S = \{x \mid (\exists y_1, \dots, y_n)\,[P(x, y_1, \dots, y_n) = 0]\},$

and the degree of $S$ is the least degree of a polynomial $P$ satisfying (*) (permitting $n$
to be as large as one likes). Now it is easy to see:
THEOREM 7.5. Every Diophantine set has degree $\le 4$.


Proof. The degree of P satisfying (*) may be reduced by introducing additional
variables $z_j$ satisfying equations of the form

$$z_j = y_i y_k, \qquad z_j = y_i^2, \qquad z_j = x y_i, \qquad z_j = x^2.$$

By successive substitutions of the $z_j$’s into P its degree can be brought down to 2.
Hence the equation is equivalent to a system of simultaneous equations each of
degree 2. Summing the squares gives an equation of degree 4.
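
As a small illustration of this reduction (a sketch only, not part of the original argument), take the set of cubes, defined by the degree 3 equation x - y^3 = 0. Introducing z = y^2 gives the system z - y^2 = 0, x - yz = 0, each of degree 2, and summing the squares gives a single equation of degree 4. The names `original`, `reduced`, and `members`, as well as the search bound, are assumptions made for the illustration; the brute-force search merely confirms that both equations define the same set within the bound.

```python
from itertools import product

def original(x, y):
    # Degree 3 equation defining the cubes: x - y^3 = 0.
    return x - y**3

def reduced(x, y, z):
    # System z = y^2, x = y*z (each of degree 2), combined by summing squares.
    return (z - y * y) ** 2 + (x - y * z) ** 2

def members(poly, n_params, bound):
    """All x in 1..bound for which poly(x, params) = 0 has a solution
    with every parameter in 1..bound (a bounded brute-force search)."""
    found = set()
    for x in range(1, bound + 1):
        for params in product(range(1, bound + 1), repeat=n_params):
            if poly(x, *params) == 0:
                found.add(x)
                break
    return found

BOUND = 30  # illustrative bound, large enough to hold z = y^2 for the cubes below it
cubes_deg3 = members(original, 1, BOUND)
cubes_deg4 = members(reduced, 2, BOUND)
print(sorted(cubes_deg3), sorted(cubes_deg4))
```

The price of lowering the degree is the extra parameter z, which is why the degree is measured while permitting the number of parameters to grow.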

A less trivial (and more surprising) fact is: 


THEOREM 7.6. There is an integer m such that every Diophantine set has
dimension $\le m$.

Proof. Write

$$D_n = \{x \mid (\exists y_1, \ldots, y_m)\,[P(x, n, y_1, \ldots, y_m) = 0]\},$$

which is possible by the universality theorem. Then the dimension of $D_n$ is $\le m$ for
all n.
An interesting example is given by the sequence of Diophantine sets:

$$S_q = \{x \mid (\exists y_1, \ldots, y_q)\,[x = (y_1 + 1) \cdots (y_q + 1)]\}.$$

Here $S_2$ is the set of composite numbers; $S_q$ is the set of “q-fold” composite numbers.
It is surely surprising that it is possible to give a Diophantine definition of $S_q$ (for
large q) requiring fewer than q parameters (cf. [19]).
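
The defining formula for these sets can be tested directly for small values. The following sketch (the function name `in_S` and the range 1 to 30 are assumptions chosen for illustration) searches for positive integers y_1, ..., y_q whose shifted product equals x, using exactly q parameters as in the displayed definition.

```python
from itertools import product
from math import prod

def in_S(q, x):
    """True if x = (y_1 + 1)...(y_q + 1) for some positive integers y_1..y_q."""
    # Each factor y_i + 1 is at least 2 and at most x, so y_i <= x - 1.
    for ys in product(range(1, x), repeat=q):
        if prod(y + 1 for y in ys) == x:
            return True
    return False

S2 = [x for x in range(1, 31) if in_S(2, x)]  # composite numbers up to 30
S3 = [x for x in range(1, 31) if in_S(3, x)]  # "3-fold" composites up to 30
print(S2)
print(S3)
```

The search space grows like x^q, which makes concrete why a definition of the same set with fewer than q parameters is far from obvious.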

How large is m, the number of parameters in the universal Diophantine set? 
A direct calculation using the arguments given here would yield a number around 50. 
Actually Matiyasevič and Julia Robinson have very recently shown that m = 14
will suffice! 

The unsolvability of Hilbert’s tenth problem can be used to obtain a strengthened 
form of Gödel’s famous incompleteness theorem:


THEOREM 7.7. Corresponding to any given axiomatization of number theory, 
there is a Diophantine equation which has no positive integer solutions, but such 
that this fact cannot be proved within the given axiomatization. 


A rigorous proof would involve a precise definition of “axiomatization of number
theory” which is outside the scope of this article. An informal heuristic argument
follows: 

One uses the given axiomatization to systematically generate all of the theorems
(i.e., consequences of the axioms). Among these theorems will be some asserting
that some Diophantine equation has no solution. Whenever such a theorem is encountered,
the equation is placed on a special list called LIST A. At the same time a list, LIST B, is made of
Diophantine equations which have solutions. LIST B is constructed by a search
procedure, e.g., at the nth stage of the search look at the first n Diophantine equations
(in a suitable list) and test for solutions in which each argument is < n. Thus every
Diophantine equation which has positive integer solutions will eventually be placed in
LIST B. If likewise each Diophantine equation with no solutions would eventually
appear in LIST A, then one would have an algorithm for Hilbert’s tenth problem.
Namely, to test a given equation for possession of a solution simply begin generating
LIST A and LIST B until the given equation appears in one list or the other. Since
Hilbert’s tenth problem is unsolvable, some equation with no solution must be
omitted from LIST A. But this is just the assertion of the theorem.
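
The search procedure that builds LIST B can be sketched as follows. This is only an illustration under stated assumptions: the four equations in `EQUATIONS` stand in for "a suitable list" of all Diophantine equations, the name `list_b` is hypothetical, and arguments are tested from 1 up to n at stage n (whether the bound is taken inclusively does not affect the argument). LIST A has no such sketch, since it depends on the axiomatization.

```python
from itertools import product

# A tiny stand-in for an enumeration of all Diophantine equations:
# each entry is (label, number of unknowns, polynomial whose zeros are sought).
EQUATIONS = [
    ("x^2 - 2y^2 = 1", 2, lambda x, y: x * x - 2 * y * y - 1),        # solvable: (3, 2)
    ("x^2 - 2y^2 = 0", 2, lambda x, y: x * x - 2 * y * y),            # no positive solution
    ("x^3 + y^3 = z^3", 3, lambda x, y, z: x**3 + y**3 - z**3),       # no positive solution
    ("x^2 + y^2 = z^2", 3, lambda x, y, z: x * x + y * y - z * z),    # solvable: (3, 4, 5)
]

def list_b(equations, stages):
    """Run the first `stages` stages of the search; return LIST B in order of discovery."""
    found = []
    for n in range(1, stages + 1):
        # Stage n: look at the first n equations, with every argument in 1..n.
        for label, arity, poly in equations[:n]:
            if label in found:
                continue
            if any(poly(*args) == 0 for args in product(range(1, n + 1), repeat=arity)):
                found.append(label)
    return found

print(list_b(EQUATIONS, 6))
```

Every solvable equation is eventually listed, but at no finite stage does the absence of an equation from LIST B show that it is unsolvable; that asymmetry is what the argument exploits.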


8. Recursively enumerable sets. It is now time to settle the question raised at the 
beginning: which sets are Diophantine? 


DEFINITION 8.1. A set S of n-tuples of positive integers is called recursively
enumerable if there are recursive functions $f(x, x_1, \ldots, x_n)$, $g(x, x_1, \ldots, x_n)$ such that:

$$S = \{\langle x_1, \ldots, x_n\rangle \mid (\exists x)\,[f(x, x_1, \ldots, x_n) = g(x, x_1, \ldots, x_n)]\}.$$


THEOREM 8.1. A set S is Diophantine if and only if it is recursively enumerable. 


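Proof. If S is Diophantine there are polynomials P, Q with positive coefficients
such that:

$$\begin{aligned}
\langle x_1, \ldots, x_n\rangle \in S &\Leftrightarrow (\exists y_1, \ldots, y_m)\,[P(x_1, \ldots, x_n, y_1, \ldots, y_m) = Q(x_1, \ldots, x_n, y_1, \ldots, y_m)] \\
&\Leftrightarrow (\exists u)\,[P(x_1, \ldots, x_n, S(1, u), \ldots, S(m, u)) = Q(x_1, \ldots, x_n, S(1, u), \ldots, S(m, u))],
\end{aligned}$$

where S(i, u) is the sequence number function of Theorem 1.3, so that S is recursively enumerable.

Conversely if S is recursively enumerable there are recursive functions
$f(x, x_1, \ldots, x_n)$, $g(x, x_1, \ldots, x_n)$ such that

$$\begin{aligned}
\langle x_1, \ldots, x_n\rangle \in S &\Leftrightarrow (\exists x)\,[f(x, x_1, \ldots, x_n) = g(x, x_1, \ldots, x_n)] \\
&\Leftrightarrow (\exists x, z)\,[z = f(x, x_1, \ldots, x_n) \;\&\; z = g(x, x_1, \ldots, x_n)].
\end{aligned}$$

Thus by Theorem 6.1, S is Diophantine.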


9. Historical appendix. The present exposition has ignored the chronological
order in which the ideas were developed. The first contribution was by Gödel in
his celebrated 1931 paper [16]. The main point of Gödel’s investigation was the
existence of undecidable statements in formal systems. The undecidable statements
Gödel obtained involved recursive functions, and in order to exhibit the simple
number-theoretic character of these statements, Gödel used the Chinese remainder
theorem to reduce them to “arithmetic” form. The technique used is just what is
used here in proving Theorem 1.3 (the sequence number theorem) and Theorem 6.1
(in the direction: every recursive function is Diophantine). However without the
techniques for dealing with bounded universal quantifiers as discussed in this paper,
the best result yielded by Gödel’s methods is that every recursive function (and
indeed every recursively enumerable set) can be defined by a Diophantine equation
preceded by a finite number of existential and bounded universal quantifiers.⁶ In my
doctoral dissertation (cf. [5], [6]), I showed that all but one of the bounded universal
quantifiers could be eliminated, so that every recursively enumerable set S could be
defined as


$$S = \{x \mid (\exists y)(\forall k)_{\le y}(\exists y_1, \ldots, y_m)\,[P(k, x, y, y_1, \ldots, y_m) = 0]\}.$$


This representation became known as the Davis normal form. (Later R. M. Robinson
[31], [32] showed that in this normal form one could take m = 4. More recently
Matiyasevič has shown that one can even take m = 2. It is known that one cannot
always have m = 0; whether one can always get m = 1 is open.)

Independent of my work and at about the same time, Julia Robinson began her 
study [27] of Diophantine sets. Her investigations centered about the question: 
Is the exponential function Diophantine? The main result was that a certain hypothesis
implied that the exponential function was Diophantine. The hypothesis, which
became known as the Julia Robinson hypothesis, has played a key role in work on 
Hilbert’s tenth problem. Its statement is simply: 

There exists a Diophantine set D such that: 

(1) $\langle u, v\rangle \in D$ implies $v \le u^u$.

(2) For each k, there is $\langle u, v\rangle \in D$ such that $v > u^k$.

The hypothesis remained an open question for about two decades. (Actually the set

$$D = \{\langle u, v\rangle \mid v = x_u(2) \;\&\; u > 3\}$$

satisfies (1) and (2) by Lemma 2.19 and is Diophantine by Corollary 3.2, so the truth
of Julia Robinson’s hypothesis follows at once from the results in this article.)
Julia Robinson’s proof that this hypothesis implies that the exponential function is
Diophantine used the Pell equation. And the proof that the exponential function is
indeed Diophantine given here is closely related to a more recent proof [28] by
her of this same implication.
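
The two conditions of the hypothesis can be checked numerically for the set D in the parenthetical remark above. The sketch below assumes that x_n(a) denotes the x-component of the nth solution of the Pell equation x^2 - (a^2 - 1)y^2 = 1, computed by the standard recurrence x_0 = 1, x_1 = a, x_{n+1} = 2a x_n - x_{n-1}; the names `pell_x` and `witness` and the search limits are illustrative assumptions. Condition (1) is checked in the strict form v < u^u, which implies the non-strict one.

```python
def pell_x(a, n):
    """x_n(a), where x_n + y_n*sqrt(a^2 - 1) = (a + sqrt(a^2 - 1))^n."""
    prev, cur = 1, a  # x_0 and x_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        # Recurrence x_{n+1} = 2a*x_n - x_{n-1}.
        prev, cur = cur, 2 * a * cur - prev
    return cur

def witness(k, limit=200):
    """Smallest u > 3 below `limit` with x_u(2) > u^k, or None if none is found."""
    for u in range(4, limit):
        if pell_x(2, u) > u ** k:
            return u
    return None

# Condition (1): v stays below u^u for every pair <u, v> in D (checked for 3 < u < 60).
condition_1 = all(pell_x(2, u) < u ** u for u in range(4, 60))
# Condition (2): for each k some pair in D has v > u^k (checked for k = 1..6).
witnesses = {k: witness(k) for k in range(1, 7)}
print(condition_1, witnesses)
```

A finite check of this kind proves nothing, of course; it only makes visible the growth, exponential but bounded by u^u, that the hypothesis demands.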

In [27], Julia Robinson also studied sets and functions which were exponential
Diophantine (or existentially definable in terms of exponentiation), that is, which
possess definitions of the form:

$$\begin{aligned}
(\exists u_1, \ldots, u_n, v_1, \ldots, v_n, w_1, \ldots, w_n)\,[&P(x_1, \ldots, x_m, u_1, \ldots, u_n, v_1, \ldots, v_n, w_1, \ldots, w_n) = 0 \\
&\&\; u_1 = v_1^{w_1} \;\&\; \cdots \;\&\; u_n = v_n^{w_n}].
\end{aligned}$$

In particular, the functions $\binom{n}{k}$ and $n!$ were shown by her to be exponential
Diophantine. This is really what is shown in proving (1) and (2) of Theorem 4.1.
The present proof of (2) is just hers; the proof of (1) given here is a simplified variant
of that in [27]. (It is due independently to Julia Robinson and Matiyasevič.)

The idea of using the Chinese remainder theorem to code the effect of a bounded 
universal quantifier first occurred in the work of myself and Putnam [7]. In [8], we 
refined our methods and were able to show, beginning with the Davis normal form, 
that IF there are arbitrarily long arithmetic progressions consisting entirely of 
primes (still an open question), then every recursively enumerable set is exponential 
Diophantine. In our proof we needed to establish that $h(a, b, y) = \prod_{k=1}^{y} (a + bk)$ is
exponential Diophantine, which we did extending Julia Robinson’s methods. (The
proof given here of (3) of Theorem 4.1 is a much simplified argument found much
later by Julia Robinson; cf. [29].) Julia Robinson then showed first how to eliminate
the hypothesis about primes in arithmetic progression, and then how to greatly 
simplify the proof along the lines of Lemma 5.2 of this article. Thus we obtained the 
theorem of [9] that every recursively enumerable set is exponential Diophantine. 

Attention was now focused on the Julia Robinson hypothesis since it was plain 
that it would imply that Hilbert’s tenth problem was unsolvable. 

Many interesting propositions were found to imply the Julia Robinson hypothesis.⁷
However the hypothesis seemed implausible to many, especially because it
was realized that an immediate and surprising consequence would be the existence of
an absolute upper bound for the dimensions of Diophantine sets (cf. Theorem 7.6).
Thus in his review [19] Kreisel said concerning the results of [9]: “... it is likely the
present result is not closely connected with Hilbert’s tenth problem. Also it is not
altogether plausible that all (ordinary) Diophantine problems are uniformly reducible
to those in a fixed number of variables of fixed degree....”

The Julia Robinson hypothesis was finally proved by Matiyasevič [23], [24].
Specifically he showed that if we define

$$a_1 = a_2 = 1, \qquad a_{n+1} = a_n + a_{n-1}$$

so that $a_n$ is the nth Fibonacci number, then the function $a_{2n}$ is Diophantine. Then
since, for $n \ge 3$, as is easily seen by induction,


$$(5/4)^n < a_n < 2^{n-1},$$

the set

$$D = \{\langle u, v\rangle \mid v = a_{2u} \;\&\; u \ge 2\}$$


satisfies the Julia Robinson hypothesis. Subsequently, direct Diophantine definitions
of the exponential function were given by a number of investigators, several of
them using the Pell equation as in this article (cf. [3], [4], [14], [18a]). The treatment
in §§2, 3 is based on Matiyasevič’s methods, although the details are Julia Robinson’s.
In particular, it was Matiyasevič who taught us how to use results like Lemmas 2.11,
2.12, and 2.22 of the present exposition. (Matiyasevič himself used analogous results
for the Fibonacci numbers.)

It was soon noticed (by S. Kochen) that by a simple inductive argument the use 
of the Davis normal form could now be entirely avoided, as has been done in the 
present exposition. 

Let #(P) be the number of solutions of the Diophantine equation P = 0. Thus
$0 \le \#(P) \le \aleph_0$. Hilbert’s tenth problem seeks an algorithm for deciding of a given
P whether or not #(P) = 0. But there are many related questions: Is there an
algorithm for testing whether $\#(P) = \aleph_0$, or #(P) = 1, or #(P) is even? I was able
to show easily (beginning with the unsolvability of Hilbert’s tenth problem) that all
of these problems are unsolvable. In fact if

$$A = \{0, 1, 2, 3, \ldots, \aleph_0\}$$

and $B \subset A$, $B \ne \emptyset$, $B \ne A$, then one can readily show that there is no algorithm for
determining whether or not $\#(P) \in B$ (cf. [15]).

The fact that no general algorithm such as Hilbert demanded will be forthcoming 
adds to the interest of algorithms for dealing with special classes of Diophantine 
equations. Alan Baker and his coworkers [1], [2] have in recent years made considerable
progress in this direction.


Notes 


1. These pairing functions (but of course not their being Diophantine) were used by Cantor in 
his proof of the countability of the rational numbers. J. Roberts and D. Siefkes each corrected an 
error in the definition of these functions. They, as well as W. Emerson, M. Hausner, Y. Matiyasevié, 
and Julia Robinson made helpful suggestions. 

2. For example, cf. [25], pp. 175-180. Matiyasevič used instead the equations $x^2 - xy - y^2 = 1$,
$u^2 - muv + v^2 = 1$.

3. The recursive functions are usually defined on the nonnegative integers. This creates a minor
but annoying technical problem in comparing the present definition with one in the literature (e.g.,
cf. [6], p. 41; also Theorem 4.2 on p. 51). Thus one can simply note that $f(x_1, \ldots, x_n)$ is recursive in
the present sense if and only if $f(t_1 + 1, \ldots, t_n + 1) - 1$ is recursive in the usual sense. From the
point of view of the intuitive “computability” of the functions involved this doesn’t matter at all;
one is simply in the position of using the positive integers as a “code” for the nonnegative integers,
using n + 1 to represent n.

4. Inclusion of S(i, u) in this list is redundant. That is, S(i, u) can be obtained using our three
operations from the remaining initial functions.

5. The method of proof is Julia Robinson’s, [28], [30]. If one were permitted to use the enumeration
theorem in recursive function theory ([6], p. 67, Theorem 1.4), the Universality Theorem would
follow at once from Theorem 6.1.

6. Actually the result which Gödel stated (as opposed to what can be obtained at once by use of
his techniques) was somewhat weaker. Indeed, the very definition of the class of recursive functions
and the perception of their significance came several years later in the work of Gödel, Church, and
Turing. In particular the suggestion that recursiveness was a precise equivalent of the intuitive
notion of being computable by an explicit algorithm was made independently by Church and by
Turing. And of course it is this identification which is essential in regarding the technical results
discussed in this account as constituting a negative solution of Hilbert’s tenth problem. (For further
discussion and references, cf. [6].)

7. For example, I showed ([13]) that the Julia Robinson hypothesis would follow from the nonexistence
of nontrivial solutions of the equation

$$9(u^2 + 7v^2)^2 - 7(r^2 + 7s^2)^2 = 2.$$

The methods used readily show that the same conclusion follows if the equation has only finitely
many solutions. Čudnovskii [4] claims to have proved that $2^x$ is Diophantine (and hence the Julia
Robinson hypothesis) using this equation. Apparently there is a possibility that some of Čudnovskii’s
work may have been done independently of Matiyasevič, but I have not been able to obtain definite
information about this.


References 


1. Alan Baker, Contributions to the theory of Diophantine equations: I. On the representation
of integers by binary forms, II. The Diophantine equation $y^2 = x^3 + k$, Philos. Trans. Roy. Soc.
London Ser. A, 263 (1968) 173-208.

2. Alan Baker, The Diophantine equation $y^2 = ax^3 + bx^2 + cx + d$, J. London Math. Soc., 43
(1968) 1-9.

3. G. V. Čudnovskii, Diophantine predicates (Russian), Uspehi Mat. Nauk, 25 (1970) no. 4
(154), pp. 185-186.

4. G. V. Čudnovskii, Certain arithmetic problems (Russian), Ordena Lenina Akad. Ukrains. SSR,
Preprint IM-71-3.

5. Martin Davis, Arithmetical problems and recursively enumerable predicates, J. Symbolic 
Logic, 18 (1953) 33-41. 

6. Martin Davis, Computability and Unsolvability, McGraw Hill, New York, 1958.

7. Martin Davis and Hilary Putnam, Reduction of Hilbert’s tenth problem, J. Symbolic Logic,
23 (1958) 183-187.

8. Martin Davis and Hilary Putnam, On Hilbert’s tenth problem, U.S. Air Force O. S. R. Report
AFOSR TR 59-124 (1959), Part III.

9. Martin Davis, Hilary Putnam, and Julia Robinson, The decision problem for exponential
Diophantine equations, Ann. Math., 74 (1961) 425-436.

10. Martin Davis, Applications of recursive function theory to number theory, Proc. Symp.
Pure Math., 5 (1962) 135-138.

11. Martin Davis, Extensions and corollaries of recent work on Hilbert’s tenth problem, Illinois J.
Math., 7 (1963) 246-250.

12. Martin Davis and Hilary Putnam, Diophantine sets over polynomial rings, Illinois J. Math.,
7 (1963) 251-256.

13. Martin Davis, One equation to rule them all, Trans. New York Acad. Sci., Series II, 30
(1968) 766-773.

14. Martin Davis, An explicit Diophantine definition of the exponential function, Comm. Pure Appl.
Math., 24 (1971) 137-145.

15. Martin Davis, On the number of solutions of Diophantine equations, Proc. Amer. Math. Soc.,
35 (1972) 552-554.

16. Kurt Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter
Systeme I, Monatsh. Math. und Physik, 38 (1931) 173-198. English translations: (1) Kurt Gödel,
On Formally Undecidable Propositions of Principia Mathematica and Related Systems, Basic
Books, 1962. (2) Martin Davis (editor), The Undecidable, Raven Press, 1965, pp. 5-38. (3) Jean
Van Heijenoort (editor), From Frege to Gödel, Harvard University Press, 1967, pp. 596-616.

17. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Fourth edition,
Oxford University Press, 1960.

18. David Hilbert, Mathematische Probleme, Vortrag, gehalten auf dem internationalen
Mathematiker-Kongress zu Paris 1900. Nachrichten Akad. Wiss. Göttingen, Math.-Phys. Kl. (1900)
253-297. English translation: Bull. Amer. Math. Soc., 8 (1901-1902) 437-479.

18a. N. K. Kosovskii, On Diophantine representations of the solutions of Pell’s equation (Russian),
Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklova, 20 (1971) 49-59.

19. Georg Kreisel, Review of [9]. Mathematical Reviews, 24 (1962) Part A, p. 573 (review number
A 3061).

20. Yuri Matiyasevič, The relation of systems of equations in words and their lengths to Hilbert’s
tenth problem (Russian). Issledovaniya po Konstruktivnoi Matematike i Matematiceskoi Logike
II. Vol. 8, pp. 132-144.

21. Yuri Matiyasevič, Two reductions of Hilbert’s tenth problem (Russian), Ibid., pp. 145-158.

22. Yuri Matiyasevič, Arithmetic representation of exponentiation (Russian). Ibid., pp. 159-165.

23. Yuri Matiyasevič, Enumerable sets are Diophantine (Russian), Dokl. Akad. Nauk SSSR, 191 (1970)
279-282. Improved English translation: Soviet Math. Doklady, 11 (1970) 354-357.

23a. Yuri Matiyasevič, Diophantine representation of the set of prime numbers (Russian). Dokl. Akad.
Nauk SSSR, 196 (1971) 770-773. Improved English translation with Addendum: Soviet Math.
Doklady, 12 (1971) 249-254.

23b. Yuri Matiyasevič, Diophantine representation of recursively enumerable predicates, Proc. Second
Scandinavian Logic Symp., editor, J. E. Fenstad, North-Holland, Amsterdam, 1971.

23c. Yuri Matiyasevič, Diophantine representation of recursively enumerable predicates, Proc. 1970
Intern. Congress Math., pp. 234-238.

24. Yuri Matiyasevič, Diophantine representation of enumerable predicates (Russian), Izv. Akad. Nauk
SSSR, Ser. Mat. 35 (1971) 3-30.

24a. Yuri Matiyasevič, Diophantine sets (Russian), Uspehi Mat. Nauk, 27 (1972) 185-222.

25. Ivan Niven and Herbert Zuckerman, An Introduction to the Theory of Numbers, 2nd ed., 
Wiley, New York, 1966. 

26. Hilary Putnam, An unsolvable problem in number theory, J. Symb. Logic, 25 (1960) 220-232. 

27. Julia Robinson, Existential definability in arithmetic, Trans. Amer. Math. Soc., 72 (1952) 
437-449. 

28. Julia Robinson, Diophantine decision problems, MAA Studies in Mathematics, 6 (1969) [Studies
in Number Theory, edited by W. J. LeVeque, pp. 76-116].

29. Julia Robinson, Unsolvable Diophantine problems, Proc. Amer. Math. Soc., 22 (1969) 534-538.

30. Julia Robinson, Hilbert’s tenth problem, Proc. Symp. Pure Math., 20 (1969) 191-194.

31. Raphael M. Robinson, Arithmetical representation of recursively enumerable sets, J. Symb.
Logic, 21 (1956) 162-186.

32. Raphael M. Robinson, Some representations of Diophantine sets, J. Symb. Logic, forthcoming.


HISTORY OF THE RIEMANN MAPPING THEOREM 
J. L. WALSH, University of Maryland 


The Riemann mapping theorem, that an arbitrary simply connected region 
of the plane can be mapped one-to-one and conformally onto a circle, first appeared 
in the Inaugural dissertation of Riemann (1826-1866) in 1851. The theorem is important,
for by it a result proved for the circle can often be transformed from the
circle to a more general region. The proof is difficult, since it involves both behavior
of a function in the small (conformal mapping) and behavior in the large (one-to-one
mapping). Riemann’s proof was open to criticism and in the following decades
numerous mathematicians sought for a proof, e.g., H. A. Schwarz (1843-1921), 
A. Harnack (1851-1888), H. Poincaré (1854-1912), etc., until the first rigorous 
proof was given in 1900 by W. F. Osgood. The proof of Osgood represented, in my 
opinion, the ‘‘coming of age’’ of mathematics in America. Until then, numerous 
American mathematicians had gone to Europe for their doctorates, or for other 
advanced study, as indeed did Osgood. But the mathematical productivity in this 
country in quality lagged behind that of Europe, and no American before 1900 
had reached the heights that Osgood then reached. 

William Fogg Osgood (1864-1943) was born in Boston in 1864, graduated from
Harvard College in 1886, stayed in Cambridge for a year of graduate work, and
then went to Göttingen with a Harvard fellowship for further study, especially
with Felix Klein (1849-1925). According to gossip, Osgood became so enamored
of a Göttingen lady that his work suffered and Klein sent him to Erlangen for his
doctorate. In any case, he was accorded the degree from Erlangen in 1890 for a thesis
on Abelian integrals, and one or two days later he married the girl in Göttingen,
and one or two days still later they sailed for the United States of America. His
early mathematical work was also of high quality. During the 1890’s he was Lebesgue’s
forerunner in the study of sequences of functions of a real variable. Osgood
taught at Harvard from 1890 until his retirement in 1933.

Professor Walsh received his Harvard Ph.D. under Maxime Bôcher and George David Birkhoff.
He continued at Harvard as Instructor through Perkins Professor of Mathematics and became
Professor Emeritus in 1966; since then he has been at the Univ. of Maryland. He has spent leaves of
absence at the Sorbonne, the Univ. of Munich, the Institute for Advanced Study, and has spent
several sabbatical leaves in Paris and Jerusalem.

He is a Fellow of the American Academy of Arts and Sciences and a Member of the National
Academy of Sciences. Both the SIAM Journal on Numerical Analysis and the Journal of Approximation
Theory have dedicated volumes to Joseph Walsh. His main research is on zeros, extremal
problems, and approximations by polynomials and orthogonal functions. He is widely known for
his invention of the Walsh functions.

His publications include Interpolation and Approximation (Amer. Math. Soc. Coll. Series, 1935,
Russian Translation, 1961), Location of Critical Points of Analytic and Harmonic Functions (Amer.
Math. Soc. Coll. Series, 1950), Approximation by Polynomials (Paris, 1935), Approximation by Bounded
Analytic Functions (Paris, 1960), The Theory of Splines and their Application (with J. H. Ahlberg
and E. N. Nilson, Academic Press, New York, 1967), A Bibliography on Orthogonal Polynomials
(with J. S. Shohat and Einar Hille, National Research Council, Bulletin, Washington, D. C. 1940),
and A Rigorous Treatment of Maximum-Minimum Problems in the Calculus (Heath, 1962). Editor.

Osgood seems not to have received the recognition for his work that he deserves. 
For instance, C. Carathéodory and G. Julia each wrote a book on conformal mapping 
without mention of the name of Osgood. 

We proceed now with the proof of Riemann’s theorem! 

By a simply connected region Riemann understood a region bounded by a simple 
closed curve, and before him special mappings by simple functions were well known. 
We assume the given region to be bounded, which may require an elementary preliminary
transformation. Let us examine Riemann’s proof (based on Dirichlet’s
Principle) and postpone discussion of its validity. 

Mapping of a region T onto a circle is equivalent to the existence of Green’s 
function for T, namely a function G(z) such that 

(1) G(z) is harmonic in T except at the origin 0, assumed interior to T; 

(2) in the neighborhood of 0 the function takes the form $G(z) = G_1(z) + \log r$,
where $r = |z|$ and $G_1(z)$ is harmonic throughout T;

(3) G(z) is continuous and equal to zero at every point of the boundary C of T. 

These three conditions determine G(z) uniquely. Green’s function for a region T 
is invariant under one-to-one conformal mapping of T. 

If the function $w = \phi(z)$ maps T (Figure 1) onto $|w| < 1$ so that $\phi(0) = 0$,
then we clearly have

$$\phi(z) = e^{G(z) + iH(z)},$$

where H(z) is conjugate to G(z) in T, for each of the conditions (1), (2), (3) is satisfied
by G(z) as thus defined. Conversely, if G(z) is Green’s function for T with pole
in 0, then every point of T is transformed by $w = \phi(z)$ into a point $|w| < 1$. Each
locus $L_r$: $|\phi(z)| = r$, $0 < r < 1$, in T bounds two subregions of T, where
$G(z) > \log r$ and $G(z) < \log r$ respectively; the locus $L_r$ has no multiple points and
separates 0 and C. On $L_r$ we have $\partial G/\partial n \ne 0$, where n is the interior normal for the
latter subregion, whence

$$\int_{L_r} \frac{\partial G}{\partial n}\,ds = \int_{L_r} \frac{\partial H}{\partial s}\,ds = \int \frac{\partial \log r}{\partial n}\,ds = 2\pi,$$

so the transformation $w = \phi(z)$ defines a one-to-one map of T onto $|w| < 1$.

[Fig. 1: the z-plane and the w-plane.]

If T is given, the determination of G(z) requires the solution of the Dirichlet
problem for T with the prescribed boundary values log r on C, a problem that
Riemann treated by means of Dirichlet’s principle. The physical evidence for the
existence of G(z) is great, for in the steady two-dimensional flow of heat, the temperature
is a harmonic function provided T is a uniform body whose continuous
boundary temperatures on C are prescribed.

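The Dirichlet integral for a function u(x, y) given in a region T is defined as

$$D(u) = \iint_T \left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2\right] dx\,dy.$$

We compare this integral with the corresponding integral where u(x, y) is replaced
by $u(x, y) + \varepsilon\, v(x, y)$, where v(x, y) vanishes on the boundary C of T. Thus we
have, to study the function u(x, y) with given boundary values minimizing D(u),

$$D(u + \varepsilon v) = \iint_T \left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2\right] dx\,dy
+ 2\varepsilon \iint_T \left(\frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y}\right) dx\,dy
+ \varepsilon^2 \iint_T \left[\left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2\right] dx\,dy.$$

Considered as a function of $\varepsilon$, this second term on the right must be zero, namely,

$$\iint_T \left[\frac{\partial}{\partial x}\left(v \frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(v \frac{\partial u}{\partial y}\right)\right] dx\,dy - \iint_T v\,\nabla^2 u\;dx\,dy = 0$$

for all choices of the arbitrary function v. The former of these two integrals reduces
to two contour integrals over C with v (= 0 on C) as a factor of the integrand. Thus
for the function u minimizing D(u), $\nabla^2 u = 0$ throughout T, and u is harmonic in T.
“The function solving the boundary value problem is the function minimizing D(u).”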
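
A discrete analogue makes the variational argument concrete. The sketch below is an illustration under stated assumptions, not Riemann's argument: the region is the unit square with a uniform grid, the Dirichlet integral is replaced by a sum of squared differences over grid edges, u = x^2 - y^2 is chosen because it is harmonic (and exactly so for the five-point difference operator), and v = x(1 - x)y(1 - y) is an arbitrary perturbation vanishing on the boundary. The names `grid`, `cross`, and `dirichlet` are hypothetical.

```python
N = 16  # the grid has (N + 1) x (N + 1) nodes on the unit square

def grid(func):
    """Sample func(x, y) at the nodes (i/N, j/N)."""
    return [[func(i / N, j / N) for j in range(N + 1)] for i in range(N + 1)]

def add(u, v, eps):
    """The perturbed function u + eps * v on the grid."""
    return [[u[i][j] + eps * v[i][j] for j in range(N + 1)] for i in range(N + 1)]

def cross(u, v):
    """Discrete analogue of the integral of u_x v_x + u_y v_y (sum over grid edges)."""
    total = 0.0
    for i in range(N + 1):
        for j in range(N + 1):
            if i < N:
                total += (u[i + 1][j] - u[i][j]) * (v[i + 1][j] - v[i][j])
            if j < N:
                total += (u[i][j + 1] - u[i][j]) * (v[i][j + 1] - v[i][j])
    return total

def dirichlet(u):
    """Discrete Dirichlet integral D(u)."""
    return cross(u, u)

u = grid(lambda x, y: x * x - y * y)              # harmonic in the square
v = grid(lambda x, y: x * (1 - x) * y * (1 - y))  # vanishes on the boundary

base = dirichlet(u)
energies = {eps: dirichlet(add(u, v, eps)) for eps in (-1.0, -0.5, 0.0, 0.5, 1.0)}
print(cross(u, v), base, energies)
```

The cross term vanishes up to rounding, so D(u + εv) = D(u) + ε² D(v) and every nonzero ε increases the integral. The sketch shows only that a harmonic function is a minimizer once it exists; it says nothing about existence in general.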

This ‘‘proof,’’ although accepted by Riemann, is obviously open to various 
objections: 

(1) The treatment has a meaning only if C has certain properties of smoothness 
and differentiability. 

(2) The fact that D(u) has a non-negative greatest lower bound does not show the
existence of a minimum (Weierstrass).

(3) The fact that $D(u) < \infty$ for some u(x, y) satisfying the given boundary values
needs to be shown (Prym 1871, Hadamard 1906).

It is convenient to assume that T is bounded; if not, we may use the transformation
$w = \sqrt{(z - \alpha)/(z - \beta)}$, where $\alpha$ and $\beta$ are two distinct boundary points of T. Then
T in the z-plane corresponds to two regions $T_1$ and $T_2$ on the w-sphere, one-to-one
conformal images of T, which have no common point. If two such regions do not
exist, a point $w_1$ in $T_1$ can be joined to a point $w_2$ by a path in $T_1$ separating w = 0
and $w = \infty$, so there is a closed curve in T separating $\alpha$ and $\beta$, and T is not simply
connected. Inversion of a point of $T_2$ to infinity now maps $T_1$ onto a bounded
region.
We mention here several results that we shall need for discussion of Osgood’s
proof.

(1) Axel Harnack’s Theorem (1887). If a function $u_n$ is harmonic in a region T for
all sufficiently large values of n, and if $u_n$ increases at all points of T when n increases;
if furthermore at a single point of T $u_n$ approaches a (finite) limit when n becomes
infinite; then $u_n$ converges at all points of T, to a function harmonic throughout T.
(It is reported that when Harnack first told Felix Klein of this theorem, the latter
refused to believe its validity.)

(2) H. A. Schwarz. Green’s function exists for a simply connected region T bounded 
by a finite number of analytic arcs. (Schwarz used the alternating method, due to C. 
Neumann.) 

(3) Lemma. If the bounded region T contains the closure of the region $T_1$, and if
O lies in $T_1$, then the respective Green’s functions g and $g_1$ with poles in O for T and
$T_1$ satisfy the inequality $g > g_1 > 0$ in $T_1$. For the difference $g - g_1$ is harmonic in
$T_1$, and $g_1 = 0$, $g > g_1$, on the boundary of $T_1$, whence $g - g_1 \ge 0$, $g - g_1 \not\equiv 0$,
throughout $T_1$.

(4) Given a region $T_1$, it can be exhausted by a monotonic sequence of subregions,
composed for instance of adjacent squares whose sides are parallel to the coordinate
axes.

[Fig. 2: the z-plane.]


Given, now, (Figure 2) a bounded simply connected region $T_1$ of the z-plane, we
show that $T_1$ can be transformed into a region T of the w-plane in such a manner
that a given boundary point $P_1$ of $T_1$ corresponds to a point P of a circle $\Gamma$ which
contains T. Let $T_1$ be considered to lie on the Riemann surface for $\omega = \log z$ with
$P_1$ at z = 0. The image $T_2$ in the $\omega$-plane of $T_1$ consists (Figure 3) of an infinite
number of images of $T_1$, each the translation of another such region by the vector
$\omega = \pm 2\pi i$. In each such region the point at infinity $\omega = \infty$ corresponds to $P_1$, for
all boundary points of $T_1$ in the region $|z| < \varepsilon$ correspond to points $\omega$ with
$\mathrm{Re}\,\omega < \log \varepsilon$. Let $|z| < R$ be the smallest circular disk with center $P_1$ containing $T_1$;
then all points of $T_2$ lie in the half-plane $\mathrm{Re}\,\omega < \log R$. A linear transformation
carrying to infinity in the w-plane a finite point $\omega_0$, $\mathrm{Re}\,\omega_0 > \log R$, carries $T_2$ into a
region T of the w-plane (Figure 4) which lies in a circular disk D (image of
$\mathrm{Re}\,\omega < \log R$) whose boundary passes through the image P of $P_1$.

[Fig. 3: the $\omega$-plane, showing $T_2$, the point $\omega_0$, the image of $P_1$, and the line $\mathrm{Re}\,\omega = \log R$.]

[Fig. 4: corresponding points and regions in the z-, $\omega$-, and w-planes.]

It may be noted too that an arbitrary point Q of $T_2$ with $\mathrm{Im}\,Q = \mathrm{Im}\,\omega_0$ can be
chosen so that Q is simultaneously carried into $O_w$ in the w-plane. The point $O_w$ in
T is then the center of D.

Let $D_n$ be a monotonic sequence of subregions of T containing $O_w$, and each with
a Green’s function $g_n$ with pole in $O_w$, with $D_{n+1} \supset D_n$, exhausting T. Let $g_0$ be
Green’s function for D with pole in $O_w$; then $g_n < g_0$ in $D_n$. Let $g = \lim_{n \to \infty} g_n$,
defined (Harnack) and harmonic throughout T except in $O_w$. Then g is Green’s
function for T; we have $0 < g_n < g_0$, $0 < g \le g_0$. Suppose $P_k \in T$, $P_k \to P$. Since

$$\lim_{P_k \to P} g_0(P_k) = 0 \quad \text{for } P_k \in D,$$

we also have

$$\lim_{P_k \to P} g(P_k) = 0, \qquad g(P) = 0,$$

and this shows the existence of Green’s function for T and thus completes Osgood’s
proof of Riemann’s theorem.


We have not mentioned the work of Hilbert (1862-1943), who gave a treatment of 
Riemann’s theorem in weakened form by new methods of the Calculus of Variations, 
commencing about 1900. This general problem in the Calculus of Variations was 
presented as Problem 20 in his famous Paris lecture of 1900. He suggested in particular 
thesis topics on the subject for several American doctoral students in Göttingen:
C. A. Noble, E. R. Hedrick, and Max Mason. However, Hilbert’s method required 
certain smoothness properties of the boundary and of the limit function, and was 
thus less general than the idea of the original Dirichlet principle and less general 
than Osgood’s proof. A new method of proof, based on function theoretic rather 
than potential theoretic properties, was developed by F. Riesz and L. Fejér, published 
in 1923 by T. Rado. Montel’s theory of normal families was used, and a lemma due 
to Koebe. This is the standard modern proof. 


Research supported in part by U.S. Air Force Office of Scientific Research, Grant AF 69-1690.


References 


1. B. Riemann, Grundlagen für eine allgemeine Theorie der Funktionen einer veränderlichen
complexen Grösse. Inauguraldissertation, Göttingen, 1851.


2. A. Harnack, Logarithmisches Potential, Leipzig, 1887, p. 167. 

3. H. A. Schwarz, Zur Integration der partiellen Differentialgleichung Δu = 0, Ges. Math.
Abhandlungen, vol. II, pp. 144-171. See also Osgood, Funktionentheorie, Fifth Ed., Leipzig, 1928,
Ch. 14 § 4.

4. H. Poincaré, Sur un théorème de la théorie générale des fonctions, Bull. Soc. Math.
France, 11 (1883) 112-125.

5. H. Poincaré, Sur l'uniformisation des fonctions analytiques, Acta Mathematica, 31 (1907) 1-63.

6. W. F. Osgood, On the existence of Green’s function for the most general simply connected
plane region, Trans. Amer. Math. Soc., 1 (1900) 310-314.

7. W. F. Osgood, Funktionentheorie, Fifth Ed., Leipzig, 1928, Ch. 14 § 5.

8. D. Hilbert, Über das Dirichletsche Prinzip, Jber. Deutsch. Math.-Verein., 8 (1900) 184-188.

9. D. Hilbert, Über das Dirichletsche Prinzip, Math. Annalen, 59 (1904) 161-186.


THE FIRST U. S. A. MATHEMATICAL OLYMPIAD 
S. GREITZER, Rutgers, The State University


At its meeting on September 1, 1971, the Mathematical Association of America 
agreed to sponsor a U. S. A. Mathematical Olympiad in addition to the Annual 
High School Mathematics Examination. The purpose of the Olympiad was to attempt 
to discover secondary school students with superior mathematical talent — who 
possessed mathematical creativity and inventiveness as well as competence in com- 
putational techniques. Participation was to be limited to about 100 students selected 
from the Honor Roll on the High School Mathematics Examination, plus a few 
students of superior ability selected from those states which did not participate in 
the High School Mathematics Examination. The Olympiad itself was to consist of 
five essay-type problems requiring mathematical power on the part of the partici- 
pants. The committee responsible for conducting the Olympiad consisted of Samuel 
L. Greitzer, Rutgers University, Alfred Kalfus, Babylon High School, Murray 
S. Klamkin, Ford Motor Company, and Nura D. Turner, SUNY at Albany. 

Invitations were sent to 106 students on April 14, 1972, and 100 students took 
the Olympiad on May 9, 1972. The committee which prepared the Olympiad 
consisted of Murray Klamkin, D. J. Newman, Yeshiva University and Abraham 
Schwartz, CUNY. The Olympiad is reproduced below. (Solutions have been provided 
at the end of this article.) 


THE FIRST U. S. A. MATHEMATICAL OLYMPIAD 
May 9, 1972 


1. The symbols (a, b, ..., g) and [a, b, ..., g] denote the greatest common divisor
and the least common multiple, respectively, of the positive integers a, b, ..., g. For
example, (3, 6, 18) = 3 and [6, 15] = 30. Prove that

$$\frac{[a, b, c]^2}{[a, b]\,[b, c]\,[c, a]} = \frac{(a, b, c)^2}{(a, b)(b, c)(c, a)}.$$

2. A given tetrahedron ABCD is isosceles, that is, AB = CD, AC = BD, AD 
= BC. Show that the faces of the tetrahedron are acute-angled triangles. 

3. A random number selector can only select one of the nine integers 1, 2,...,9, 
and it makes these selections with equal probability. Determine the probability that 
after n selections (n > 1), the product of the n numbers selected will be divisible by 
10. 

4. Let R denote a nonnegative rational number. Determine a fixed set of integers
a, b, c, d, e, f such that, for every choice of R,

$$\left| \frac{aR^2 + bR + c}{dR^2 + eR + f} - \sqrt[3]{2} \right| < \left| R - \sqrt[3]{2} \right|.$$

5. A given convex pentagon ABCDE has the property that the area of each of
the five triangles ABC, BCD, CDE, DEA and EAB is unity. Show that every
noncongruent pentagon with the above property has the same area, and that,
furthermore, there are an infinite number of such noncongruent pentagons.


All 100 participants mailed in their papers by May 15, and they were graded by a 
committee consisting of Dr. John Bender, Dr. Richard Bumby, Dr. L. M. Kelly, 
Dr. Sol Leader, Dr. B. Muckenhoupt and Dr. H. Zimmerberg. We are indebted to
the mathematics department of Rutgers University for their help in grading. The 
final results were sent out to the participating schools on June 1, 1972. These results 
are tabulated below: 


H.S. Exam  85-89.75  90-94.75  95-99.75  100-104.75  105-109.75  110-114.75  115-119.75  120-124.75  125-129.75
Olymp.


90-99 


80-89 1 1
70-79 : 1 3 1 
1 


60-69 1 

50-59 6 2 6 1 i | 
40-49 7 1 i 1 1 

30-39 2 5 4 1 1 1 1 

20-299... °® #©385 4 EO 


10-19 5 1 1 3 
0-9 6 3 2 1 1 


The eight finalists selected to receive awards had scores indicated in the rectangle at 
the upper part of the table. 




While it is not safe to draw conclusions on the basis of so small a sample, it would 
appear that (a) a high score on the High School Mathematics Examination did not 
necessarily lead to a high score on the Olympiad. In fact, the correlation is only 0.24. 
We can conclude that the Olympiad is measuring something other than does the
High School Mathematics Examination. However, (b) it must be noted that all 
eight finalists did score above 100 on the High School Mathematics Examination. 
We may conclude that students rated superior on the Olympiad are also superior 
on the High School Mathematics Examination, but that the converse is not necessarily 
true. 

Interest in the Olympiad was high. Members of the Olympiad Committee re- 
ceived numerous calls and letters asking for further information, offers to help, and 
pleas to be allowed to take part. On May 8, we received calls from two schools that 
had not received the contest materials. In both cases, the difficulty originated with 
the school. It was too late, however, to mail anything to the schools. Therefore, Mr. 
John Clark drove to Rutgers from Garden City, New York, about 75 miles each way, 
and Mr. Kazlouskas drove to Rutgers from Binghamton, New York, about 200 
miles each way, and picked up the Olympiad materials personally. Both supervisors 
are to be commended for their interest in their students. Incidentally, the student 
from Garden City was tied for second place. Plans had originally called for special 
recognition for the top five contestants. Because there were ties for second, third 
and fourth places, the number of finalists was increased to eight. The list of finalists 
is given below: 


1 James Saxe Albany High School Albany, N. Y. 
Thomas Hemphill James Monroe High School Sepulveda, Calif. 
David Vanderbilt Garden City High School Garden City, N. Y. 
Paul Harrington Paul V. Moore H. S. Central Square, N. Y. 
Arthur Rubin West Lafayette H. S. West Lafayette, Ind.
David Anick Ranney School New Shrewsbury, N. J.
Steven Raher Central High School Sioux City, Iowa

5 James Shearer Livermore High School Livermore, Calif. 


The problem of providing suitable recognition for these very talented students 
was solved for the Committee through the generosity of International Business 
Machines, Inc., which agreed to supply funds for a ceremony. Dr. Nura Turner was 
in charge of this ceremony, which consisted of two days of activities in Washington, 
D. C. All eight contestants were present, some with parents (traveling at their own 
expense and paying for their own accommodations). 

On Tuesday, September 12, there was a reception at the National Academy of 
Sciences, where the finalists were given awards presented by the Olympiad Committee
and the NCTM. Dr. Emanuel Piore spoke, congratulating the finalists and discussing
the roles of mathematicians in science and mathematics. This was followed by a
dinner at the Department of State.

On Wednesday, September 13, the group toured the White House, and were 
greeted by Dr. Edward E. David, Science Advisor to the President. They then visited 
the National Bureau of Standards, had lunch there, after a greeting on behalf of 
Dr. Burton Colvin. In the afternoon, they listened to lectures on mathematical 
applications by Dr. M. Newman, Dr. Wesley Nicholson, Dr. R. A. Kirsch and Dr. 
A. J. Goldman. 

The Committee acknowledges with thanks the help provided by these organiza- 
tions, and hopes they will be equally helpful to future Olympiad finalists. 

The Olympiad Committee: 

S. Greitzer (chairman) 
A. Kalfus 

M. Klamkin 

N. Turner 


Solutions to Problems 


1. Let $a = \prod p_i^{a_i}$, $b = \prod p_i^{b_i}$, $c = \prod p_i^{c_i}$, where the $p_i$ denote the prime factors of a, b, c
(some of the exponents may be zero).
Since $[a, b] = \prod p_i^{\max(a_i, b_i)}$, $(a, b) = \prod p_i^{\min(a_i, b_i)}$, etc., we have to show that

$$\begin{aligned}
&2\max(a_i, b_i, c_i) - \max(a_i, b_i) - \max(b_i, c_i) - \max(c_i, a_i) \\
&\quad = 2\min(a_i, b_i, c_i) - \min(a_i, b_i) - \min(b_i, c_i) - \min(c_i, a_i).
\end{aligned}$$

Without loss of generality, let $a_i \ge b_i \ge c_i$ for any particular index i. Then
$2a_i - a_i - b_i - a_i = 2c_i - b_i - c_i - c_i$, since both sides equal $-b_i$. (One contestant gave analogous results for
four and five numbers.)
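
The identity of Problem 1 can also be checked numerically. The following is a minimal sketch, not part of the published solution: it tests the cross-multiplied form $[a,b,c]^2 (a,b)(b,c)(c,a) = (a,b,c)^2 [a,b][b,c][c,a]$ for all triples up to an arbitrarily chosen bound. The helper names `lcm` and `identity_holds` and the bound 30 are assumptions made for illustration.

```python
from math import gcd
from itertools import product


def lcm(x, y):
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)


def identity_holds(a, b, c):
    """Check [a,b,c]^2 (a,b)(b,c)(c,a) == (a,b,c)^2 [a,b][b,c][c,a]."""
    lcm3 = lcm(lcm(a, b), c)
    gcd3 = gcd(gcd(a, b), c)
    left = lcm3 ** 2 * gcd(a, b) * gcd(b, c) * gcd(c, a)
    right = gcd3 ** 2 * lcm(a, b) * lcm(b, c) * lcm(c, a)
    return left == right


# Exhaustive check over a small range; the bound 30 is arbitrary.
all_ok = all(identity_holds(a, b, c)
             for a, b, c in product(range(1, 31), repeat=3))
print(all_ok)
```

Working with the cross-multiplied form keeps everything in integer arithmetic, so no rounding is involved.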


2. It follows immediately that the sum of the face angles at any vertex is 180°.
Since the sum of any two face angles of a trihedral angle is greater than the third, each
angle must be acute. Again (see diagram) assume that $\angle BDC$ is nonacute. Let M
be the midpoint of BC. Since $\triangle ABC \cong \triangle DCB$, AM = DM. Now consider a circle
with M as center and BC as a diameter. Since $\angle BDC$ is nonacute, D must lie within
or on the circle. Thus $2DM \le BC$. By the triangle inequality on $\triangle AMD$, AM + MD
> AD. But since AM = MD and AD = BC, we obtain a contradiction. Therefore,
$\angle BDC$ is acute.


3. In order for the product to be divisible by 10, there must be at least one 5 
and at least one even number among the n numbers that have come up. Let A denote 
the event of obtaining at least one 5 and let B denote the event of obtaining at least 
one even number in n spins. If A+B, AB, A', P(E) denote the union of A, B;
the intersection of A, B; the complement of A; the probability of the event E,
respectively, we have P(AB) = 1 - P(A') - P(B') + P(A'B'). Hence

$$P(AB) = 1 - \left(\tfrac{8}{9}\right)^n - \left(\tfrac{5}{9}\right)^n + \left(\tfrac{4}{9}\right)^n.$$
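
As a sanity check on Solution 3, the closed form $1 - (8/9)^n - (5/9)^n + (4/9)^n$ can be compared with a direct enumeration of all $9^n$ equally likely outcomes for small n. This is only an illustrative sketch; the function names `closed_form` and `brute_force` are assumptions and do not come from the article.

```python
from fractions import Fraction
from itertools import product


def closed_form(n):
    """P(product divisible by 10) = 1 - (8/9)^n - (5/9)^n + (4/9)^n."""
    return (1 - Fraction(8, 9) ** n - Fraction(5, 9) ** n
            + Fraction(4, 9) ** n)


def brute_force(n):
    """Enumerate all 9^n outcomes and count those with a 5 and an even number."""
    favourable = 0
    for draw in product(range(1, 10), repeat=n):
        has_five = 5 in draw
        has_even = any(d % 2 == 0 for d in draw)
        if has_five and has_even:
            favourable += 1
    return Fraction(favourable, 9 ** n)


# Compare both computations for n = 1, ..., 5 (9^5 = 59049 outcomes at most).
agree = all(closed_form(n) == brute_force(n) for n in range(1, 6))
print(agree)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than an approximate one.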


4. Given the inequality, we wish to determine fixed a, b, c, d, e, f to satisfy it
for all nonnegative rational R. As $R \to \sqrt[3]{2}$ through a sequence of rational numbers,
the right hand side of this inequality approaches zero. Consequently, the left hand
side must vanish if we set $R = \sqrt[3]{2}$. Therefore,

$$a \cdot 2^{2/3} + b \cdot 2^{1/3} + c = 2d + e \cdot 2^{2/3} + f \cdot 2^{1/3}.$$

It is then necessary that a = e, b = f, c = 2d. On substituting back into the
inequality and factoring out the common factor $R - \sqrt[3]{2}$ from both sides, we obtain

$$\left| \frac{aR + b - d \cdot 2^{1/3}(R + 2^{1/3})}{dR^2 + aR + b} \right| < 1.$$

For the last inequality to be satisfied, it suffices to let a, b, d be positive integers and
make the numerator nonnegative, i.e., by letting $a > d \cdot 2^{1/3}$, $b > d \cdot 4^{1/3}$. Two simple
choices are d = 1, a = b = 2, and d = 3, a = 4, b = 5, leading respectively to

$$\frac{2R^2 + 2R + 2}{R^2 + 2R + 2} \quad \text{and} \quad \frac{4R^2 + 5R + 6}{3R^2 + 4R + 5}.$$


It is to be noted that the second approximation is a better one than is the first. 
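
The two approximations $(2R^2+2R+2)/(R^2+2R+2)$ and $(4R^2+5R+6)/(3R^2+4R+5)$ can be tried on a sample of nonnegative rationals. The sketch below is an illustration under stated assumptions: the sample set is arbitrary, the cube root of 2 is taken in floating point (adequate here because the sampled rationals are not extremely close to it), and the names `first`, `second`, and `improves` are hypothetical.

```python
from fractions import Fraction

CBRT2 = 2 ** (1 / 3)  # floating-point cube root of 2


def first(R):
    """Approximation from d = 1, a = b = 2."""
    return (2 * R * R + 2 * R + 2) / (R * R + 2 * R + 2)


def second(R):
    """Approximation from d = 3, a = 4, b = 5."""
    return (4 * R * R + 5 * R + 6) / (3 * R * R + 4 * R + 5)


def improves(approx, R):
    """True if approx(R) is strictly closer to the cube root of 2 than R is."""
    return abs(float(approx(R)) - CBRT2) < abs(float(R) - CBRT2)


# Arbitrary sample of nonnegative rationals p/q with small denominators.
samples = [Fraction(p, q) for q in range(1, 8) for p in range(0, 40)]

both_improve = all(improves(first, R) and improves(second, R) for R in samples)
second_is_better = all(
    abs(float(second(R)) - CBRT2) <= abs(float(first(R)) - CBRT2)
    for R in samples
)
print(both_improve, second_is_better)
```

On this sample the second fraction is never farther from $\sqrt[3]{2}$ than the first, in line with the closing remark of the solution.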


5. Since $\triangle EDC = \triangle BDC = 1$, both triangles have equal altitudes to side CD.
Thus DC ∥ EB. Similarly the other diagonals are parallel to their respective
opposite sides. Whence, with P the intersection of BD and CE, ABPE is a parallelogram and $\triangle PEB = 1$. Letting
$\triangle EDP = y = \triangle PBC$ and $\triangle PDC = x$, we then have x + y = 1. Also:

$$\frac{\triangle EPD}{\triangle EPB} = \frac{DP}{PB} = \frac{\triangle DPC}{\triangle CPB}, \quad \text{or} \quad \frac{y}{1} = \frac{x}{y}.$$

Thus $y^2 + y - 1 = 0$ and $y = (\sqrt{5} - 1)/2$. Area (ABCDE) $= 2 + y + x + y = (5 + \sqrt{5})/2$.


To prove that there is an infinite number of such noncongruent pentagons,
construct an arbitrary triangle PDC whose area is x. Extend CP to E and DP to B,
so that $\triangle EDC = \triangle BDC = 1$. Now draw EA ∥ BD and AB ∥ EC. It follows from
the previous analysis that the pentagon has the desired property.

Another proof of the last part can be gotten easily by parallel projection of a
regular pentagon with the desired property, i.e., let x' = mx, y' = y/m (m arbitrary).
Under this transformation, areas are preserved. (One contestant showed that these
projections constitute all the solutions to the problem.)
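
The projection argument can be illustrated numerically. The sketch below, which is an illustration and not part of the published solution, scales a regular pentagon so that each triangle on three consecutive vertices has area 1, applies x' = mx, y' = y/m for a few arbitrary values of m, and checks that the five triangle areas stay equal to 1 while the total area stays $(5+\sqrt{5})/2$. All function names are assumptions.

```python
import math


def triangle_area(p, q, r):
    """Area of triangle pqr from the cross product of two edge vectors."""
    return abs((q[0] - p[0]) * (r[1] - p[1])
               - (r[0] - p[0]) * (q[1] - p[1])) / 2


def polygon_area(points):
    """Shoelace formula for a simple polygon given by its vertices in order."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def stretched_pentagon(m):
    """Regular pentagon with unit corner triangles, mapped by (x, y) -> (m x, y / m)."""
    regular = [(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
               for k in range(5)]
    scale = 1 / math.sqrt(triangle_area(*regular[:3]))
    return [(m * scale * x, scale * y / m) for x, y in regular]


def corner_triangle_areas(points):
    """Areas of ABC, BCD, CDE, DEA, EAB for a pentagon ABCDE."""
    n = len(points)
    return [triangle_area(points[i], points[(i + 1) % n], points[(i + 2) % n])
            for i in range(n)]


target = (5 + math.sqrt(5)) / 2
for m in (1.0, 2.0, 0.3):  # arbitrary stretch factors
    pentagon = stretched_pentagon(m)
    print(m, corner_triangle_areas(pentagon), polygon_area(pentagon), target)
```

The map has determinant 1, which is why every area is unchanged; different values of m give noncongruent pentagons because side lengths change.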


CORRECTION TO “WHAT IS A RECIPROCITY LAW?”
This MONTHLY, 79:571-586 (June-July, 1972)
B. F. WYMAN, Ohio State University

I wish to thank Karl A. Beres and Lawrence J. Dickson for pointing out that the
theorem of Section 6 (p. 583) is incorrect. Professor Dickson has derived correct
formulas for the $\nu_m$ and $\mu_j$ (defined on p. 582):


(1) Vn = a (j,m)u;, n = deg (f). 


[x/j] 
(2) Hj = 7 Lin jeer) u(m)/ PCT) jem > 


where $\varphi$ and $\mu$ are the Euler and Möbius functions.
Since the theorem referred to was not used in the preparation of the computer 
program mentioned in Section 7, the numerical results reported there are unaffected. 


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 


The present backlog for this Department is substantial. Until further notice, new manuscripts 
cannot be accepted. This moratorium will probably continue until June 1, 1973; authors are 
requested to hold their manuscripts pending a further announcement. 


ON ELEMENTARY PROOFS OF PEANO’S EXISTENCE THEOREMS 
JOHANN WALTER, Technische Hochschule, 51 Aachen, FRG. 


1. Historical and didactical remarks. The theory of ordinary differential equations 
is an important building stone in the curriculum of every student of applied 
mathematics. The fundamental problems arising in this theory have attracted and 
deserve persevering attention, particularly from the didactical point of view. (For a 
recent discussion of these questions cf. [14].) The nonexistence even for very familiar 
differential equations — like certain Sturm-Liouville or Riccati equations — of ‘‘closed 
formulas’’ representing the solutions of these equations in terms of ‘‘elementary 
functions’’ is one of the most puzzling phenomena we are confronted with in this 
part of analysis. It is by this phenomenon that we are forced to look for existence 
theorems of a general kind. The most important of these existence theorems as 
regards the weakness of its assumptions as well as the conceptual simplicity of the 
idea underlying its proof is the existence theorem of Peano. The didactical require- 
ment thus arises to have a proof as simple and as elementary as possible of this 
theorem or of a significant special case of it. 

In this connection Kennedy [6] has proposed as a ‘‘Research Problem’’ the 
question ‘‘Is there an elementary proof of Peano’s existence theorem for first order 
differential equations?’’ Before entering into a more detailed discussion of this 
Research Problem we have to clear up two terminological questions. 

1.) Kennedy calls attention to two papers [9], [10] (reprinted in [11], p. 74 and 
119) of Peano. In these papers Peano formulates two existence theorems differing 
from each other. Consider the ordinary differential equation 


x’ = f(t, x), 


where x and f belong to Euclidean d-space $R^d$ and t is real. Let K be a positive
number, J = [0, 1], f continuous in $J \times R^d$ and $|f(t, x)| \le K$ for $(t, x) \in J \times R^d$.


THEOREM 1 ([9], 1886). Let d = 1. The initial value problem

(1) $x'(t) = f(t, x(t))$ for $t \in J$, $x(0) = 0$,

has solutions $x_{\min}$, $x_{\max}$ such that $x_{\min}(t) \le x(t) \le x_{\max}(t)$ for $t \in J$ holds for every
solution x(·) of (1).


THEOREM 2 ([10], 1890). Let $d \ge 1$. The initial value problem (1) has at least
one solution.


REMARK. In the case d= 1, Theorem 2 is a trivial consequence of Theorem 1. 
This means that every proof of Theorem 1 is also a proof of Theorem 2. On the 
other hand, the proofs of Theorem 2 are more simple than those of Theorem 1. 

All authors who mention the name of Peano at all agree to call Theorem 2 the 
Theorem of Peano. Kamke [5] and Hille [4] do not even seem to know [9]. Now it 
is not completely clear to which of the two theorems Kennedy’s question does refer. 
Although he cites Theorem 2 at the beginning of his paper, it is obviously Theorem 1
he has in view later on. (Similarly W. Walter in his reply [15] ‘“There is an elementary 
proof of Peano’s existence theorem’’ quotes Theorem 2 (for d = 1) but proves the 
stronger Theorem 1.) 

2.) It can be inferred from the work of Kennedy [6] and W. Walter ([15] and 
Autorreferat in Zentralblatt für Mathematik 207 (1971), 84) that these authors call
a proof elementary if it avoids equicontinuous families and an appeal to the lemma
of Arzela-Ascoli and if the special properties of $R^1$ (as compared with $R^d$) are used
instead. We shall not enter into the notion of constructiveness emphasized in this
connection by several authors ([3], [8], [15] but not [9]).

Do there exist proofs of Peano’s theorems which are elementary in the sense just 
described? 


Ad Theorem 1: Kennedy himself specifies three proofs of this kind: Peano’s 
original proof [9], Perron’s proof [12] and Osgood’s proof [8]. But he takes the 
first two proofs to be defective, the third being rather complicated. W. Walter in his 
reply [15] gives another elementary proof and refers to the proof of Grunsky [3]. 
Moreover, he stresses that Perron’s proof is correct. (In the opinion of the present
author, Peano’s proof is also essentially correct!)


Ad Theorem 2: In the proof of Theorem 2 for d > 1 the lemma of Arzela-Ascoli 
seems unavoidable. Every time, however, the special case d=1 of Theorem 2 is 
treated separately (e.g., in [1], [2], [7], [13]) its proof is based on the lemma of 
Arzela-Ascoli too. 


Résumé: Although a great number of elementary proofs of Theorem 1 exist, an
elementary proof of Theorem 2 in the case d = 1 is still lacking in the literature.
A proof of this kind is of at least didactical interest, because, as we have seen, the
case d = 1 of Theorem 2 is treated separately in several textbooks and because, on
the assumption of Theorem 2, the proof of the existence of $x_{\min}$, $x_{\max}$ can be given in
an especially suggestive way (cf. [1], [4]): It is possible to represent $x_{\min}$ resp. $x_{\max}$ as
the infimum resp. supremum of the set of all solutions of (1) provided this set is not
empty. But this is exactly the statement of Theorem 2.

In the following we shall give an elementary proof of Theorem 2 for d = 1. In
this proof (just as in [1], [5], [7], [8], [12], [15]) use is made of the integral equation

$$x(t) = \int_0^t f(s, x(s))\, ds$$


associated with (1). An elementary proof without this integral equation (as in [2], 
[3], [4], [9], [10], [13]) would perhaps also be interesting. 


2. An elementary proof of Peano’s second theorem for d = 1. Let

$$E = \{t_0, t_1, \dots, t_N\}, \qquad 0 = t_0 < t_1 < \dots < t_N = 1,$$

be a partition of J with the norm

$$|E| = \sup_{k = 1, \dots, N} (t_k - t_{k-1}).$$

We define $x_0 = 0$ and $x_k = x_{k-1} + (t_k - t_{k-1}) f(t_{k-1}, x_{k-1})$ for k = 1, ..., N. An
approximate solution $p_E(\cdot)$ is then constructed by joining the points $(t_k, x_k)$ by a
polygonal line; $p_E$ is called the Euler polygon associated with the partition E. By
construction we have

(2) $p_E(0) = 0$, $|p_E(t)| \le K$ for $t \in J$,
$|p_E(t_1) - p_E(t_2)| \le K \cdot |t_1 - t_2|$ for $t_1, t_2 \in J$.
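
The construction of the Euler polygon can be sketched in a few lines. The code below is a minimal illustration, assuming a partition given as an increasing list starting at 0 and ending at 1; the function name `euler_polygon`, the sample right-hand side, and the uniform partition are hypothetical choices, not taken from the article. It builds the nodes $x_k = x_{k-1} + (t_k - t_{k-1}) f(t_{k-1}, x_{k-1})$ and joins them linearly, then checks the bounds (2) on a grid.

```python
import math
from bisect import bisect_right


def euler_polygon(f, partition):
    """Return the Euler polygon p_E on [0, 1] as a callable.

    partition is an increasing list t_0 = 0 < t_1 < ... < t_N = 1.
    """
    xs = [0.0]  # x_0 = 0
    for k in range(1, len(partition)):
        step = partition[k] - partition[k - 1]
        xs.append(xs[-1] + step * f(partition[k - 1], xs[-1]))

    def p(t):
        # Find the subinterval [t_{k-1}, t_k] containing t and interpolate.
        k = min(max(bisect_right(partition, t), 1), len(partition) - 1)
        t0, t1 = partition[k - 1], partition[k]
        w = (t - t0) / (t1 - t0)
        return (1 - w) * xs[k - 1] + w * xs[k]

    return p


def f(t, x):
    """Sample right-hand side (an assumption), continuous with |f| <= K = 1."""
    return math.cos(t + x)


K = 1.0
p_E = euler_polygon(f, [k / 8 for k in range(9)])
grid = [i / 100 for i in range(101)]
bounded = all(abs(p_E(t)) <= K for t in grid)
lipschitz = all(abs(p_E(s) - p_E(t)) <= K * abs(s - t) + 1e-12
                for s, t in zip(grid, grid[1:]))
print(p_E(0.0), bounded, lipschitz)
```

The small tolerance in the Lipschitz test only absorbs floating-point rounding; the slopes of the polygon are values of f and therefore bounded by K.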


The following three lemmas are also used (at least implicitly) by W. Walter [15]. 


LEMMA 1. Let $E_n$, n = 1, 2, ..., be a sequence of partitions of J such that
$|E_n| \to 0$, $n \to \infty$. Then a function N(·) defined for positive real numbers exists
such that (except at a finite number of points where $p'_{E_n}$ does not exist)


(3) $n > N(\varepsilon) \Rightarrow |p'_{E_n}(t) - f(t, p_{E_n}(t))| < \varepsilon$.


LEMMA 2. Let p and $p_n$, n = 1, 2, ..., be continuous functions defined in J such
that $p(t) = \lim_{n \to \infty} p_n(t)$ uniformly in J. Then if

(4) $\lim_{n \to \infty} \left[ p_n(t) - \int_0^t f(s, p_n(s))\, ds \right] = 0$ for $t \in J$,

p is a solution of (1).


Now the usual proof of Theorem 2 (d = 1) proceeds as follows: Because of (2)
the sequence $p_{E_n}$ is a family of uniformly bounded equicontinuous functions. On
account of the lemma of Arzela-Ascoli there exists a subsequence converging uniformly
in J. For this subsequence (3) holds a fortiori. By integrating (3) we get (4). This
completes the usual proof.

The lemma of Arzela-Ascoli has the following corollary which can also easily be 
proved directly. 




LEMMA 3. Let x and $x_n$, n = 1, 2, ..., be functions defined in J such that

(5) $x(t) = \lim_{n \to \infty} x_n(t)$ for every $t \in J$,
(6) $|x_n(t_1) - x_n(t_2)| \le K |t_1 - t_2|$ for $t_1, t_2 \in J$, n = 1, 2, ....

Then we have $|x(t_1) - x(t_2)| \le K |t_1 - t_2|$ for $t_1, t_2 \in J$, and $x_n$ converges uniformly in J
to x.


Using Lemma 3 the application of the lemma of Arzela-Ascoli in the usual proof
sketched above obviously can be avoided if a sequence of functions satisfying (4), (5)
and (6) is available. A sequence of this kind can easily be obtained exploiting a
special property of $R^1$, viz., the order relation. Let $E_n$, n = 1, 2, ..., be a sequence of
partitions of J such that $|E_n| \to 0$, $n \to \infty$, and for $t \in J$ define

(7) $p_{n,k}(t) = \sup_{n \le j \le k} p_{E_j}(t)$, $n \le k$,
(8) $p_n(t) = \sup_{j \ge n} p_{E_j}(t)$,
(9) $p(t) = \inf_{n} p_n(t)$.


In the following it will be shown that the sequence $p_n$, n = 1, 2, ..., has the properties
(4), (5) and (6) required above. Firstly all functions just defined are bounded in
absolute value by K. Because of (7), $p_{n,k}(t)$ is increasing in k and because of (8), $p_n(t)$ is
decreasing in n.

Therefore we have

(10) $p_n(t) = \lim_{k \to \infty} p_{n,k}(t)$,
(11) $p(t) = \lim_{n \to \infty} p_n(t)$.


From (11) we infer that $p_n$ satisfies (5). $p_{n,k}$ is a polygonal line (with only a finite
number of jumps in the first derivative) and by (2) and (7) also lipschitzean with
Lipschitz constant K. Thus for fixed n the polygons $p_{n,k}$ satisfy the assumptions (5),
(6) of Lemma 3. Therefore the convergence of (10) is uniform and $p_n$ is also lipschitzean
with Lipschitz constant K. This means that the sequence $p_n$ also satisfies (6). Moreover,
the relation (3) remains true (uniformly in k) if $p_{E_n}$ is replaced by $p_{n,k}$. Integrating
this relation we get

(12) $n > N(\varepsilon) \Rightarrow \left| p_{n,k}(t) - \int_0^t f(s, p_{n,k}(s))\, ds \right| < \varepsilon$,

(also uniformly in k). Passing to the limit with respect to k in (12) we arrive at (4),
q.e.d.
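
The role of the order relation can be made concrete with a small experiment. The following sketch is only an illustration under stated assumptions: the partitions $E_j$ are taken to be the uniform partitions of [0, 1] into $2^j$ pieces, the right-hand side is a sample bounded function, and the names `euler_polygon` and `upper_envelope` are hypothetical. It forms the pointwise maximum of finitely many Euler polygons, as in (7), and checks on a grid that this envelope is increasing in k and still has Lipschitz constant K.

```python
import math
from bisect import bisect_right


def euler_polygon(f, N):
    """Euler polygon for the uniform partition of [0, 1] into N pieces."""
    ts = [k / N for k in range(N + 1)]
    xs = [0.0]
    for k in range(1, N + 1):
        xs.append(xs[-1] + (ts[k] - ts[k - 1]) * f(ts[k - 1], xs[-1]))

    def p(t):
        k = min(max(bisect_right(ts, t), 1), N)
        w = (t - ts[k - 1]) / (ts[k] - ts[k - 1])
        return (1 - w) * xs[k - 1] + w * xs[k]

    return p


def upper_envelope(polygons, n, k):
    """p_{n,k}(t): the maximum of the polygons with index j, n <= j <= k."""
    def p_nk(t):
        return max(polygons[j](t) for j in range(n, k + 1))
    return p_nk


def f(t, x):
    """Sample right-hand side (an assumption) with |f| <= K = 1."""
    return math.cos(3 * t + x)


K = 1.0
polygons = {j: euler_polygon(f, 2 ** j) for j in range(1, 9)}
p_2_5 = upper_envelope(polygons, 2, 5)
p_2_8 = upper_envelope(polygons, 2, 8)

grid = [i / 200 for i in range(201)]
monotone_in_k = all(p_2_8(t) >= p_2_5(t) for t in grid)
lipschitz = all(abs(p_2_8(s) - p_2_8(t)) <= K * abs(s - t) + 1e-12
                for s, t in zip(grid, grid[1:]))
print(monotone_in_k, lipschitz)
```

A finite maximum of K-lipschitzean functions is again K-lipschitzean, which is the property the proof feeds into Lemma 3; the grid test can only illustrate this, not prove it.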




References 


1. E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw- 
Hill, New York-Toronto-London, 1955. 

2. L. E. El’sgol’ts, Differential Equations, Hindustan Publishing, Delhi, 1961. 

3. H. Grunsky, Ein konstruktiver Beweis für die Lösbarkeit der Differentialgleichung y’ = f(x,y)
bei stetigem f(x,y), Jber. Deutsch. Math.-Verein. Abt. 1, 63 (1960) 78-84.

4. E. Hille, Lectures on Ordinary Differential Equations, Addison-Wesley, Reading, Mass.,
1969.

5. E. Kamke, Differentialgleichungen I, 5. Auflage, Akademische Verlagsgesellschaft Geest &
Portig K.-G., Leipzig, 1964.

6. H. C. Kennedy, Is there an elementary proof of Peano’s existence theorem for first order
differential equations? This MONTHLY, 76 (1969) 1043-1045.

7. F. J. Murray and K. S. Miller, Existence Theorems for Ordinary Differential Equations, New
York University Press, New York, 1954.

8. W. F. Osgood, Beweis der Existenz einer Lösung der Differentialgleichung y’ = f(x,y) ohne
Hinzunahme der Cauchy-Lipschitzschen Bedingung, Monatsh. Math., 9 (1898) 331-345.

9. G. Peano, Sull’integrabilità delle equazioni differenziali del primo ordine, Atti Accad. Sci.
Torino, 21 (1886) 677-685.

10. G. Peano, Démonstration de l’intégrabilité des équations différentielles ordinaires, Math. Ann.,
37 (1890) 182-228.

11. G. Peano, Opere Scelte, vol. 1, edited by Ugo Cassina, Rome, 1957-1959.

12. O. Perron, Ein neuer Existenzbeweis für die Integrale der Differentialgleichungen y’ = f(x,y),
Math. Ann., 76 (1915) 471-484.

13. J. G. Petrovski, Ordinary Differential Equations, Prentice Hall, Englewood Cliffs, N. J.,
1966.

14. A. Strauss and J. A. Yorke, On the fundamental theory of differential equations, SIAM
Review, 11 (1969) 236-246.

15. W. Walter, There is an elementary proof of Peano’s existence theorem, this MONTHLY, 78
(1971) 170-173.


A REMARK CONCERNING ABSOLUTELY CONTINUOUS FUNCTIONS 


F. S. VAN VLECK, University of Kansas and University of Colorado


In a recent note Goffman [1] gave a short, clear proof of the well-known theorem: 


THEOREM A. A function f whose derivative f’ exists everywhere and is summable 
is absolutely continuous. 


The purpose of this note is to point out that Goffman essentially proved some- 
what more — by trivially modifying his argument one obtains a not so well-known 
characterization of absolutely continuous functions. 

It is well known that the derivative of an absolutely continuous function exists
almost everywhere. There is also a standard counterexample [cf. 2, p. 168] showing that
everywhere existence of f’ cannot be replaced by almost everywhere existence in
Theorem A. In [3, p. 183] and [4, Exercise 18.41 (d)] it is shown that f’ existing everywhere
can be relaxed to f’ existing except on a countable set. The following theorem shows
what additional condition must be assumed in order to replace everywhere existence
of f’ by almost everywhere existence.


THEOREM B. Let I be a closed interval and $f: I \to R$. Necessary and sufficient
conditions for f to be absolutely continuous are:
(i) f is continuous on I.
(ii) f’ exists almost everywhere on I and is summable.
(iii) For $E \subset I$, $\mu(E) = 0$ implies $\mu(f(E)) = 0$.


Condition (iii) is sometimes expressed by saying f is an N-function or that f has
Property N. According to [5, p. 224] this concept is due to N. N. Lusin. By [1, Lemma 
1], f’ existing everywhere implies (iii) and so Theorem B yields Theorem A as an im- 
mediate corollary. 

The necessity part of Theorem B is well known, although condition (iii) seems to 
be neglected in most, but not all, analysis texts. Hewitt and Stromberg [4] give a 
discussion of (iii) and, in particular, give another characterization of absolutely 
continuous functions, due to Banach and Zarecki, which involves Property N. Saks 
[5, pp. 224-228] also discusses this condition and explicitly states Theorem B. Var- 
berg [6, Theorem 3] proves an n-dimensional version of the sufficiency part of Theorem 
B along the lines of Saks’ proof. 

We now show that Goffman essentially proved the sufficiency part of Theorem B. 
We indicate what changes must be made in his argument. First, in [1, Corollary] 
replace the hypothesis “f is everywhere differentiable” by “f has property (iii)”. The
conclusion is the same and the proof is essentially the same. Next, replace the
hypotheses of [1, Lemma 2] by “f satisfies (i)-(iii)”. The conclusion and proof remain as
in [1]. The rest of the proof of the sufficiency in Theorem B is exactly as in [1, Proof 
of Theorem]. 


D. Varberg has kindly pointed out that these theorems also appear in his article “On absolutely 
continuous functions”, this MONTHLY, 72 (1965) 831-841. 


References 


1. C. Goffman, On functions with summable derivatives, this MONTHLY, 78 (1971) 874-875.

2. W. Rudin, Real and Complex Analysis, McGraw-Hill, New York, 1966.

3. H. Kestelman, Modern Theories of Integration, Dover, New York, 1960.

4. E. Hewitt and K. Stromberg, Real and Abstract Analysis, Springer-Verlag, New York, 1965.

5. S. Saks, Theory of the Integral, Dover, New York, 1964.

6. D. Varberg, On differentiable transformations in $R^n$, this MONTHLY, 73 (1966) 111-114.




ON NON-ASSOCIATIVE ALGEBRAS DERIVED FROM GRAPHS 
W. E. JENNER, University of North Carolina, Chapel Hill 


A construction is given for associating an algebra with any finite graph in 
such a way that the algebra is simple if and only if the graph is connected. The al- 
gebras obtained are in general nonassociative and the simple algebras appear to be 
new. 

A graph, in the sense it will be used here, is a set of objects called vertices, together 
with a collection of two-element subsets called edges. Two vertices belonging to a 
given edge are said to be adjacent. A graph is said to be connected if for any two of 
its vertices p and q there is a sequence of vertices, beginning with p and terminating 
with q, such that any two successive vertices in the sequence are adjacent. A graph is 
finite if it has a finite set of vertices. 

Now let $\Gamma$ be a finite graph and K be any field. Let $\mathfrak{A}(\Gamma)$ be the set of all functions
of $\Gamma \times \Gamma$ into K. This is made into an algebra in the following manner. First, if
$f, g \in \mathfrak{A}(\Gamma)$ and $\lambda, \mu \in K$ then $\lambda f + \mu g$ is the mapping $(p, q) \to \lambda f(p, q) + \mu g(p, q)$. This
makes $\mathfrak{A}(\Gamma)$ into a vector space. The elements $\{e_{pq}\}$ constitute a basis, where $e_{pq}$
is the mapping taking the pair (p, q) into $1 \in K$ and all other pairs into $0 \in K$.
Multiplication in $\mathfrak{A}(\Gamma)$ is defined in terms of the basis elements as follows:
$e_{pq} \cdot e_{rs} = 0$ if $q \ne r$; $e_{pq} \cdot e_{qs} = e_{ps}$ if p = q, or q = s, or p and q are adjacent,
or q and s are adjacent; otherwise the product is zero.

If $\Gamma$ has n elements and any two vertices are adjacent, then $\mathfrak{A}(\Gamma)$ is isomorphic
to the associative algebra $[K]_n$. In general, however, $\mathfrak{A}(\Gamma)$ will not be associative.
Suppose, for instance, that p and q are adjacent, q and r are adjacent, but p and r
are not adjacent. Then $(e_{rp} \cdot e_{pq}) \cdot e_{qr} = e_{rq} \cdot e_{qr} = e_{rr}$, whereas $e_{rp} \cdot (e_{pq} \cdot e_{qr}) = e_{rp} \cdot e_{pr} = 0$.
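
The multiplication rule on basis elements is easy to experiment with. In the sketch below, which is an illustration and not the author's notation, a basis element $e_{pq}$ is represented by the pair `(p, q)` and the zero element by `None`; the function name `make_product` and the two sample graphs are assumptions. It reproduces the nonassociative triple on the path p, q, r and confirms associativity on basis elements for a complete graph on three vertices.

```python
from itertools import product


def make_product(edges):
    """Return the product of basis elements for the graph with these edges."""
    adjacent = {frozenset(e) for e in edges}

    def adj(u, v):
        return frozenset((u, v)) in adjacent

    def mul(x, y):
        """Product of basis elements; (p, q) stands for e_pq, None for 0."""
        if x is None or y is None:
            return None
        (p, q), (r, s) = x, y
        if q != r:
            return None
        if p == q or q == s or adj(p, q) or adj(q, s):
            return (p, s)
        return None

    return mul


# Path graph p - q - r: p and r are not adjacent.
mul = make_product([("p", "q"), ("q", "r")])
left = mul(mul(("r", "p"), ("p", "q")), ("q", "r"))   # (e_rp e_pq) e_qr
right = mul(("r", "p"), mul(("p", "q"), ("q", "r")))  # e_rp (e_pq e_qr)
print(left, right)

# Complete graph on three vertices: basis products behave like matrix units.
vertices = ["p", "q", "r"]
mul_complete = make_product([("p", "q"), ("q", "r"), ("p", "r")])
basis = list(product(vertices, repeat=2))
associative_on_basis = all(
    mul_complete(mul_complete(x, y), z) == mul_complete(x, mul_complete(y, z))
    for x in basis for y in basis for z in basis
)
print(associative_on_basis)
```

Since the product is bilinear, associativity on all triples of basis elements is equivalent to associativity of the algebra, so the second check covers the complete-graph case for this small example.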


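THEOREM. If $\Gamma$ is a finite connected graph, then $\mathfrak{A}(\Gamma)$ is a simple algebra.


Proof. Suppose $\mathfrak{a}$ is a nonzero ideal of $\mathfrak{A}(\Gamma)$ and that $f = \sum \lambda_{pq} e_{pq}$ ($\lambda_{pq} \in K$) is
a nonzero element of $\mathfrak{a}$. Suppose $\lambda_{rs} \ne 0$. Then $(e_{rr} \cdot f) \cdot e_{ss} = \lambda_{rs} e_{rs} \in \mathfrak{a}$ and so $e_{rs} \in \mathfrak{a}$.
Take any $p, q \in \Gamma$. Since $\Gamma$ is connected, there exists a sequence p, a, b, ..., k, r of
vertices such that any two successive terms are adjacent. Then $e_{pa} \cdot (e_{ab} \cdot ( \cdots (e_{kr} \cdot e_{rs}) \cdots ))$
$= e_{ps} \in \mathfrak{a}$. Similarly, operating on the right, it follows that $e_{pq} \in \mathfrak{a}$ and so $\mathfrak{a} = \mathfrak{A}(\Gamma)$.
Thus $\mathfrak{A}(\Gamma)$ is simple.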

The structure of the algebra $\mathfrak{A}(\Gamma)$ can explicitly be determined for any finite
graph $\Gamma$, not necessarily connected. Indeed let $\Gamma = \Gamma_1 \cup \cdots \cup \Gamma_t$ be the
decomposition of $\Gamma$ into its connected components. Then $\mathfrak{A}(\Gamma) = \mathfrak{A}(\Gamma_1) + \cdots + \mathfrak{A}(\Gamma_t) + \mathfrak{N}$,
vector space direct sum, where $\mathfrak{N}$ is the subspace of $\mathfrak{A}(\Gamma)$ spanned by those $e_{pq}$
where p and q belong to different components $\Gamma_i$. It is verified immediately that
$\mathfrak{N}$ is a zero algebra and an ideal of $\mathfrak{A}(\Gamma)$. Now $\mathfrak{A}(\Gamma) - \mathfrak{N}$ is the direct sum of the
$\mathfrak{A}(\Gamma_i)$ and so $\mathfrak{N}$ is the radical of $\mathfrak{A}(\Gamma)$ in any reasonable sense. The sum of the
$\mathfrak{A}(\Gamma_i)$, which is direct, is a subalgebra $\mathfrak{A}_0$ of $\mathfrak{A}(\Gamma)$, so that there is a Wedderburn
Principal Theorem $\mathfrak{A}(\Gamma) = \mathfrak{A}_0 + \mathfrak{N}$.


The author is indebted to Ladnor Geissinger, conversations with whom suggested that there 
might be algebras of this sort; also to Douglas Kelly for helpful advice. The work reported here was
done at the Seminar on Combinatorial Theory held at Bowdoin College in the summer of 1971. 


A FINITE DIFFERENCE PROOF THAT $E = mc^2$
DONALD GREENSPAN, University of Wisconsin


Abstract. It is shown that the classical formula $E = mc^2$ follows directly from forward difference
definitions of both velocity and energy, thus avoiding the necessity of the concept of a derivative.


In order to reach the reader of minimal background, Taylor and Wheeler [2] 
developed the theory of special relativity using differences, whenever possible, 
rather than derivatives. In this note we shall show that the classical formula $E = mc^2$
can, in fact, be established entirely without the concept of a derivative. Such a result
is not only of interest in itself, but it also affirms the intrinsic role of finite differences
in the development of physical models, a result already substantiated by the
application of high speed computers in solving nonlinear problems of applied science [1].

First, let us summarize in a convenient way the basic concepts which are neces- 
sary for the discussion. Consistently, we measure not only length, but also time, 
in the same unit, meters, as follows. A meter of time, denoted by 1 meter/c, is the time
it takes for light to travel one meter. Thus,


(1) 1 meter/c = $(3.335640)10^{-9}$ sec.


It is assumed that at every point in Euclidean three-space there is a clock which 
is synchronized with the clock at the origin. When one observes an event and records 
not only its position but also the time on the clock at that position, one says that 
an observation has been made in space-time. The coordinates of an event are of 
the form (x, y,z,¢). With regard to the observation of events in space-time, it will 
be assumed that the coordinate system is inertial and that all laws of physics are 
the same in every inertial reference frame. 

Though time will always be measured in meters, it is sometimes convenient 
to measure speed conventionally as $v$ meters per second, or in light-time as $\beta$ meters
per meter. Thus, if $t_1$ and $t_2$ are any two time readings such that


    $t_2 - t_1 = 1$ meter/c,


and if a particle in motion along an X-axis is at $x_1$ at time $t_1$ and at $x_2$ at time $t_2$,
then we define $\beta$ and $v$ at $t_1$ by the forward differences


(2)    $\beta = \dfrac{x_2 - x_1}{t_2 - t_1},$


(3)    $v = \dfrac{x_2 - x_1}{(t_2 - t_1)(3.335640)\,10^{-9}}.$


The units of $\beta$ are then meters per meter, while the units of $v$ are meters per second.
From (2) and (3) one has


(4)    $\beta = v/c.$

Of course, the speed of light $\beta^*$ is given by


    $\beta^* = 1$ meter per meter.


Note also that if a particle has a constant speed $\beta$, then (2) does yield this exact
value from $t_1$, $t_2$, $x_1$ and $x_2$.

Next, consider two inertial frames moving relative to each other in such a way
that their X-axes are collinear. Call one the laboratory frame and call the second,
which moves in a positive direction relative to the first, the rocket frame. A light
flashes and is recorded in both systems. The problem is to relate the coordinates
$(x, y, z, t)$ in the lab frame to the coordinates $(x', y', z', t')$ in the rocket frame. Under
the simplifying assumptions that the flash occurs on the X-axes with $y = z = y' = z' = 0$,
and that the origins of the two systems are coincident at $t = 0$, then, if $\beta_r$
is the constant speed of the rocket frame relative to the lab frame, and if $\beta_r < 1$,
the desired relationships are a special case of the well-known Lorentz transformation
and are given by


(5)    $x = [x' + \beta_r t'][1 - \beta_r^2]^{-1/2},$

(6)    $t = [\beta_r x' + t'][1 - \beta_r^2]^{-1/2}.$


With regard to the time of an event, observe that the variable $\tau$, given by


(7)    $\tau = [t^2 - x^2]^{1/2},$

can be rewritten by means of (5) and (6) as

(8)    $\tau = [(t')^2 - (x')^2]^{1/2}.$


Since $\tau$ is the same in both coordinate frames, it is an invariant which, when
$t^2 - x^2 > 0$, is defined to be the proper time of an event. In observing two events,
say $E_1$ with $x = x_1$, $t = t_1$ and $E_2$ with $x = x_2$, $t = t_2$, then


(9)    $\Delta\tau = [(t_2 - t_1)^2 - (x_2 - x_1)^2]^{1/2}$


is called the proper time between the two events and is also an invariant under
transformation (5)-(6).

Finally, let us now turn to the concept of energy. Consider a particle P of mass 
m which, for simplicity, is in motion only on an X-axis of, say, a lab frame. Its
position is observed at every $\Delta t = (3.335640)\,10^{-9}$ seconds. Let $t_1$ and $t_2$ be the
times of two consecutive observations and let $x_1$ and $x_2$ be the respective X-coordinates
of P at these times. Then the particle's relativistic energy $E^*$ at time $t_1$ is defined
by the forward difference formula


(10)    $E^* = m\,\dfrac{t_2 - t_1}{\Delta\tau},$

where the units of $E^*$ are units of mass. To convert relativistic energy $E^*$ to energy
$E$ in conventional units requires ([2], p. 103) multiplication of $E^*$ by $c^2$, so that

(11)    $E = E^* c^2.$


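By means of (4), (9), and (10), one can then rewrite (11) as


    $E = mc^2\,\dfrac{t_2 - t_1}{\Delta\tau} = \dfrac{mc^2\,(t_2 - t_1)}{[(t_2 - t_1)^2 - (x_2 - x_1)^2]^{1/2}} = \dfrac{mc^2}{[1 - \beta^2]^{1/2}} = \dfrac{mc^2}{[1 - v^2/c^2]^{1/2}}.$


If $\beta < 1$, then


    $E = mc^2\left[1 + \tfrac{1}{2}\beta^2 + \tfrac{3}{8}\beta^4 + \cdots\right] = mc^2\left[1 + \tfrac{1}{2}\dfrac{v^2}{c^2} + \tfrac{3}{8}\dfrac{v^4}{c^4} + \cdots\right].$


For $\beta$ small, then,


(12)    $E \approx mc^2 + \dfrac{mv^2}{2},$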


where $mv^2/2$ is the kinetic energy of the particle and $mc^2$ is called its rest energy,
because, when $v = 0$,


(13)    $E = mc^2.$


Thus, the well-known formula (13) has followed directly from difference formula- 
tions (2), (3) and (10) of the basic physical concepts of velocity and energy. 
It should be noted that, in a consistent fashion, other physical concepts also
can be given relativistic formulations in terms of differences ([2], pp. 103-121).
Thus, for example, in the notation used to define $E^*$ in (10), the energy-momentum
vector can be defined as $(m(t_2 - t_1)/\Delta\tau,\ m(x_2 - x_1)/\Delta\tau)$.
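
As a numerical illustration of formulas (9)-(13), the following sketch evaluates the forward difference energy from two observations. It is only a sketch under stated assumptions: the function name, the test mass of 1, and the value of c used for the conversion are choices made here, not part of the note.

```python
import math

# Assumed value of the speed of light in meters per second (not taken from the note).
C = 299_792_458.0


def forward_difference_energy(m, x1, t1, x2, t2):
    """Energy E = E* c^2 from two observations (x1, t1), (x2, t2).

    Positions and times are both measured in meters, as in the note.
    """
    dt = t2 - t1
    dx = x2 - x1
    d_tau = math.sqrt(dt ** 2 - dx ** 2)  # proper time between the events, formula (9)
    e_star = m * dt / d_tau               # relativistic energy in units of mass, formula (10)
    return e_star * C ** 2                # conventional units, formula (11)


m = 1.0
beta = 1e-3                # a slow particle: beta meters per meter of time
v = beta * C               # the same speed in meters per second, formula (4)

rest = forward_difference_energy(m, 0.0, 0.0, 0.0, 1.0)
moving = forward_difference_energy(m, 0.0, 0.0, beta, 1.0)

print("rest energy      :", rest)
print("excess over rest :", moving - rest)
print("m v^2 / 2        :", 0.5 * m * v ** 2)
```

For a particle at rest the function returns exactly m c^2, and for small beta the excess over the rest energy agrees closely with the kinetic energy m v^2/2, as (12) predicts.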


References 


1. D. Greenspan, Introduction to Numerical Analysis and Applications, Markham, Chicago, 
1971. 
2. E. F. Taylor and J. A. Wheeler, Spacetime Physics, Freeman, San Francisco, 1966. 


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied 
by relevant references (if any are known to the author) and by a brief description of known 
partial results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics,
and Computing Science, The University of Calgary, Calgary 44, Alberta, Canada.


REACHABILITY PROBLEMS IN VECTOR ADDITION SYSTEMS 
B. O. NASH, State University of New York at Buffalo 


Vector addition systems have arisen in several areas: program schemata, Karp 
[1]; numeric algorithms, Karp [2]; formal languages, Ginsburg [3], and Nash [4]. 
The definitions and notations used here follow Karp [1]. Let $N = \{0, 1, 2, \ldots\}$.
Let $Z = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$. Let x and y be ordered n-tuples over Z with


    $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$;


then define $x \le y$ if $x_i \le y_i$ for each i, where $i = 1, 2, \ldots, n$. Define $x + y$ as
$(x_1 + y_1, \ldots, x_n + y_n)$ and $-x$ as $(-x_1, -x_2, \ldots, -x_n)$. Let $x = (x_1, x_2, \ldots, x_m)$ and
$y = (y_1, y_2, \ldots, y_n)$ be tuples over Z. Define $x \times y$ as the (m + n)-tuple


    $(x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_n)$.


Let 0 denote the zero n-tuple where n is clear from context and $0_n$ if not.

An n-dimensional vector addition system is an ordered pair (y, W) where y is an
n-tuple over N and W is a finite set of n-tuples over Z.

The reachability set R(y, W) of vector addition system (y, W) is the set of all
n-tuples x over N, such that either x = y or there exists a sequence $w_1, w_2, \ldots, w_k$
of n-tuples, each in W, such that for each i, where $i = 1, 2, \ldots, k$, the sum
$y + w_1 + w_2 + \cdots + w_i \ge 0$ and


    $x = y + w_1 + w_2 + \cdots + w_k$.


An n-tuple x is reachable in (y, W) if x is in R(y, W).
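
The definition of R(y, W) can be explored by a breadth-first search, sketched below. This is not a decision procedure for the reachability problem: it only enumerates the tuples reachable with at most a given number of additions, and the bound `max_steps`, the function names, and the sample system are assumptions made for illustration.

```python
from collections import deque


def add(u, v):
    """Componentwise sum of two tuples of equal length."""
    return tuple(a + b for a, b in zip(u, v))


def reachable_within(y, W, max_steps):
    """Tuples reachable from y using at most max_steps vectors of W.

    Every partial sum must stay >= 0 in each component, as the definition requires.
    """
    start = tuple(y)
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        x, steps = frontier.popleft()
        if steps == max_steps:
            continue
        for w in W:
            z = add(x, w)
            if min(z) >= 0 and z not in seen:
                seen.add(z)
                frontier.append((z, steps + 1))
    return seen


# Hypothetical 2-dimensional system chosen only for the example.
y = (1, 0)
W = [(-1, 2), (1, -1)]
print(sorted(reachable_within(y, W, 2)))
print((2, 0) in reachable_within(y, W, 3))
```

Because the search is cut off at `max_steps`, a tuple that is absent from the result may still be reachable by a longer sequence.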

The general reachability problem is "Given an n-tuple x and an n-dimensional
vector addition system (y, W), does there exist an algorithm to decide whether or
not x is in R(y, W)?"

The terms "algorithm" and "decide" are used here in the sense of "computable
by Turing Machine"; see Rogers [5]. In what follows, the phrase "is x in R(y, W)?"
will stand for the full version given above.

The general reachability problem is now shown to be equivalent (see Rogers
[5]) to two subproblems.

The reachable-from-zero problem is "Is x in R(0, W)?"

The zero-reachable problem is "Is 0 in R(y, W)?"

The reachable-from-zero problem was first studied by M. Rabin [personal 
communication]. Theorem 1 was discovered by the author; Theorem 2 was dis- 
covered independently by M. Rabin and the author after joint discussions of these 
problems. 

The term "reducible" is used as follows: If problem A is reducible to problem B, 
then, if problem B can be solved, problem A can be solved. If A is reducible to B 
and B reducible to A, then A and B are equivalent. See Rogers [5]. 


THEOREM 1. The general reachability problem is reducible to the zero-reachable 
problem. 


Proof. Let x be an n-dimensional vector over N and let (y, W) be an n-dimensional
vector addition system.
Let $(y \times 1, W')$ be the (n + 1)-dimensional vector addition system in which


    $W' = \{w \times 0 \mid w \in W,\ 0 \in N\} \cup \{(-x) \times (-1)\}$.


It is claimed that x is in R(y, W) if and only if 0 is in $R(y \times 1, W')$ and therefore
that the general reachability problem "is x in R(y, W)?" is solved by the zero-reachable
problem "is 0 in $R(y \times 1, W')$?" If x is in R(y, W) by using the sequence


    $w_1, w_2, \ldots, w_k$

of vectors from W then 0 is in $R(y \times 1, W')$ by the sequence

    $w_1 \times 0,\ w_2 \times 0,\ \ldots,\ w_k \times 0,\ (-x) \times (-1)$

of vectors from W'. If 0 is in $R(y \times 1, W')$ by a sequence of the form

    $w_1 \times 0,\ w_2 \times 0,\ \ldots,\ w_k \times 0,\ (-x) \times (-1)$,

then x is in R(y, W) by the sequence


    $w_1, w_2, \ldots, w_k$.


It only remains to show that if 0 is in $R(y \times 1, W')$ then a sequence ending in
$(-x) \times (-1)$ can always be found. But any sequence starting from $y \times 1$ and
reaching 0 must contain exactly one occurrence of the vector $(-x) \times (-1)$ to remove
the "1" in the (n + 1)-component introduced by $y \times 1$, since $(-x) \times (-1)$
is the only vector in W' that contains $-1$ as (n + 1)-component. Since $(-x) \times (-1)$
contains no positive components, vectors used after it in the sequence could just
as well have been used ahead of it. Therefore it can be moved to the end of the
sequence without invalidating the boundary condition and a sequence of the required
form can always be found. ∎
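
The construction in the proof of Theorem 1 can be written out concretely. The sketch below builds the start vector and W' from x and (y, W), then checks one witness sequence on a small hypothetical instance; the helper names and the example data are assumptions made here for illustration.

```python
def cross(u, v):
    """The tuple u x v obtained by concatenating u and v."""
    return tuple(u) + tuple(v)


def zero_reachable_instance(x, y, W):
    """Return (start, W_prime) for the (n+1)-dimensional system of Theorem 1."""
    start = cross(y, (1,))
    w_prime = [cross(w, (0,)) for w in W]
    w_prime.append(cross(tuple(-a for a in x), (-1,)))
    return start, w_prime


def is_witness(start, sequence, target):
    """True if adding the sequence to start keeps every partial sum >= 0 and ends at target."""
    current = tuple(start)
    for w in sequence:
        current = tuple(a + b for a, b in zip(current, w))
        if min(current) < 0:
            return False
    return current == tuple(target)


# Hypothetical instance: x = (1, 1) is reached from y = (1, 0) by (-1, 2), (1, -1).
x, y, W = (1, 1), (1, 0), [(-1, 2), (1, -1)]
original = [(-1, 2), (1, -1)]
start, w_prime = zero_reachable_instance(x, y, W)
lifted = [cross(w, (0,)) for w in original] + [w_prime[-1]]

print(is_witness(y, original, x))
print(is_witness(start, lifted, (0, 0, 0)))
```

The last vector of the lifted sequence is the single vector $(-x) \times (-1)$, which removes the extra component introduced by $y \times 1$.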


THEOREM 2. The zero-reachable problem is reducible to the reachable-from-zero 
problem. 


Proof. Let x be an n-dimensional vector over N. Let W be a finite set of n-dimensional
vectors over Z.
Let $W' = \{-w \mid w \in W\}$.
Claim 0 is in R(x, W) if and only if x is in R(0, W').
For any n-dimensional vectors x, y, z, if $x + y = z$ then $z + (-y) = x$. Therefore
if the sequence

    $w_1, w_2, \ldots, w_k$


shows that 0 is in R(x, W), then the sequence

    $-w_k, -w_{k-1}, \ldots, -w_1$


shows that x is in R(0, W'). The same argument shows that if x is in R(0, W') then
0 is in R(x, W). ∎
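
The reversal argument of Theorem 2 can be checked on a small case. In the sketch below the helper names and the sample vectors are assumptions chosen for illustration; the point is that negating a witness sequence and reading it backwards retraces the same partial sums in the opposite order.

```python
def is_witness(start, sequence, target):
    """True if adding the sequence to start keeps every partial sum >= 0 and ends at target."""
    current = tuple(start)
    for w in sequence:
        current = tuple(a + b for a, b in zip(current, w))
        if min(current) < 0:
            return False
    return current == tuple(target)


def reversed_negated(sequence):
    """The sequence -w_k, ..., -w_1 built from w_1, ..., w_k."""
    return [tuple(-a for a in w) for w in reversed(sequence)]


# Hypothetical example: 0 is reached from x = (1, 0).
x = (1, 0)
zero = (0, 0)
forward = [(-1, 2), (0, -1), (0, -1)]
backward = reversed_negated(forward)

print(is_witness(x, forward, zero))
print(backward, is_witness(zero, backward, x))
```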


THEOREM 3. These problems are equivalent: 
(a) the general reachability problem, 
(b) the zero-reachable problem, 

(c) the reachable-from-zero problem. 


Proof. Theorems 1 and 2 show that (a) is reducible to (b), which is reducible to (c).
But any reachable-from-zero problem is also a general reachability problem, showing
that (c) is reducible to (a) and that the three problems are equivalent.

A. Rosenburg [personal communication] has formulated these problems using
rational numbers as follows: If any vector $v = (v_1, v_2, \ldots, v_k)$ over Z is mapped to
the rational number


    $p = p_1^{v_1} p_2^{v_2} \cdots p_k^{v_k}$


in which $p_i$ is the ith prime for $i = 1, 2, \ldots, k$, then addition of vectors becomes
multiplication of rational numbers. The zero-reachable problem can be restated:
Given $s \in N$ and S a finite set of rational numbers, does there exist a sequence


    $s_1, s_2, \ldots, s_k$


of members of S such that for all $i = 1, 2, \ldots, k$ the product $s s_1 \cdots s_i$ is in N and
$s s_1 s_2 \cdots s_k = 1$?
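
The prime-power encoding can be illustrated with exact rational arithmetic. The sketch below is illustrative only: the function name `encode` and the fixed list of six primes are assumptions made here, so it handles vectors of dimension at most six.

```python
from fractions import Fraction

# The first few primes; enough for vectors of dimension up to six (an assumption of this sketch).
PRIMES = (2, 3, 5, 7, 11, 13)


def encode(v):
    """Map the integer vector v to the product of PRIMES[i] ** v[i]."""
    if len(v) > len(PRIMES):
        raise ValueError("vector too long for the prime table in this sketch")
    result = Fraction(1)
    for p, exponent in zip(PRIMES, v):
        result *= Fraction(p) ** exponent
    return result


u = (1, -2, 0)
w = (0, 3, 1)
total = tuple(a + b for a, b in zip(u, w))

print(encode(u))                               # 2/9
print(encode(total) == encode(u) * encode(w))  # addition becomes multiplication
print(encode((2, 0, 1)).denominator == 1)      # a vector >= 0 maps to an integer
```

A vector has all components nonnegative exactly when its image is an integer, and the zero vector maps to 1, which is how the boundary condition and the target of the zero-reachable problem translate.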

The following results are from Karp [1]: 

It is decidable whether R(y, W) is finite. 

It is undecidable whether R(y,W) = R(y’,W’). 

From Ginsburg [3] comes: 

If $w \ge 0$ for all $w \in W$, then "is x in R(y, W)?" is decidable.

In Nash [4], the general reachability problem is shown to be equivalent to the 
emptiness problem for context-free parallel leveled grammars, a type of formal 
grammar investigated by the author in his doctoral thesis. 


Acknowledgement. The author wishes to acknowledge the financial assistance of the National 
Research Council of Canada under Grant A-1617 during this work. 


References 


1. R. M. Karp and R. E. Miller, Parallel program schemata, J. Comput. System Sci., 3 (1969)
147-195.
2. R. M. Karp, R. E. Miller, and S. Winograd, The organization of computations for uniform
recurrence equations, J. Assoc. Comput. Mach., 14 (1967) 563-590.
3. S. Ginsburg, The Mathematical Theory of Context-Free Languages, McGraw-Hill, New York,
1966.
4. B. O. Nash, Context-Free Parallel Leveled Languages, Research Report CSRR 2026, Dept.
Appl. Analysis and Comput. Sci., University of Waterloo, Canada, September, 1970.
5. H. Rogers, Theory of Recursive Functions and Effective Computability, McGraw-Hill, New
York, 1967.


WHEN DO ALL k-SEQUENCES MODULO m HAVE PERIOD ONE? 
E. A. PARBERRY and NANCY GRAUDONS, Wells College 


For any pair (k, m) of positive integers we define the set of k-sequences mod m
to be the set of sequences $(u_n)_{n \ge 1}$ of integers satisfying $0 \le u_n < m$ and
$u_{n+1} \equiv u_n + [u_n/k] \pmod{m}$, where brackets denote integer part. Obviously, these
sequences are ultimately periodic. We are concerned with determining those pairs
(k, m) such that every k-sequence mod m has period one.

For example, observe that all 2-sequences mod 12 have period one. We need 
only consider the sequences 


    (0, 0, ...)
    (1, 1, ...)
    (2, 3, 4, 6, 9, 1, 1, ...)
    (5, 7, 10, 3, 4, 6, 9, 1, 1, ...)
    (8, 0, 0, ...)
    (11, 4, 6, 9, 1, 1, ...),


since all other 2-sequences mod 12 are "tails" of these. To see that other periods
are possible, note that the 2-sequence mod 10, (2, 3, 4, 6, 9, 3, 4, 6, 9, ...), has period
four.
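
The examples above are easy to reproduce by direct iteration. The following sketch computes the ultimate period of a k-sequence and searches for the smallest modulus admitting a period larger than one; the function names are choices made here and are not part of the article.

```python
def step(u, k, m):
    """One step of the recurrence u -> u + [u/k] (mod m)."""
    return (u + u // k) % m


def period(u0, k, m):
    """Ultimate period of the k-sequence mod m with initial term u0."""
    first_seen = {}
    u, n = u0, 0
    while u not in first_seen:
        first_seen[u] = n
        u = step(u, k, m)
        n += 1
    return n - first_seen[u]


def all_period_one(k, m):
    """True if every k-sequence mod m has period one."""
    return all(period(u0, k, m) == 1 for u0 in range(m))


def smallest_modulus_with_longer_period(k):
    """Smallest m for which some k-sequence mod m has period larger than one."""
    m = 1
    while all_period_one(k, m):
        m += 1
    return m


print(all_period_one(2, 12))                    # every 2-sequence mod 12 has period one
print(period(2, 2, 10))                         # the 2-sequence mod 10 starting at 2
print(smallest_modulus_with_longer_period(2))
print(smallest_modulus_with_longer_period(3))
```

The search in `smallest_modulus_with_longer_period` is a plain exhaustive check over all initial terms, which is adequate for small k.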

Trivially, for m < k, all k-sequences mod m have period one. If we let m(k) 
denote the smallest number such that some k-sequence mod m(k) has period larger 
than one, then we have: 


PROPOSITION. If k > 1, then k(k + 1) < m(k) < k(k + 2). 


Also, letting i(k) be that number such that m(k) = k(k + 1) + i(k), we have 
calculated the following: 


    k:     2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
    i(k):  1, 2, 3, 2, 2, 4, 1, 2,  4,  1,  5,  4,  1,  5,  5.


These values give no hint toward an exact formula for m(k), but they do show that
the bounds given are best possible. To prove the proposition we need the following
lemma about the function f, defined by $f(n) = n + [n/k]$ for integers $n \ge 0$.


LEMMA. The function $f: Z^+ \to Z^+$ is injective, strictly increasing, and its
range is
    $\{n \in Z^+ : n \not\equiv k \pmod{k + 1}\}$.


Proof. That f is strictly increasing, and hence injective, is obvious. To determine
the range, let $n \in Z^+$ be written $n = c(k + 1) + b$, where $0 \le b \le k$. If $b \ne k$,
then $f(ck + b) = c(k + 1) + b$, and hence n is in the range. Also, the consecutive
integers $ck + k - 1$ and $ck + k$ map by f into $c(k + 1) + k - 1$ and $c(k + 1) + k + 1$
respectively, so $c(k + 1) + k$ is not in the range of f. ∎


Proof of the proposition. The lemma shows that f maps $\{n \in Z^+ : 0 \le n \le m - 1\}$
monotonically onto


    $\{n \in Z^+ : 0 \le n \le (m - 1) + [(m - 1)/k],\ n \not\equiv k \pmod{k + 1}\}$.


Hence $m \le f(n) \le (m - 1) + [(m - 1)/k]$ implies $0 \le f(n) - m < k$ just if
$m \le k(k + 1)$. Therefore every k-sequence mod m has ultimate period one if
$m \le k(k + 1)$, so $m(k) > k(k + 1)$.

Denoting the rth iterate of f on n by $f^r(n)$, we must have $f^i(k) = k(k + 1) + j$
with $0 \le j < k$ for some unique i, since $f(n) > n$ when $n \ge k$, and if
$f^{i-1}(k) < k(k + 1) \le f^i(k)$, then $f^i(k) \le f(k^2 + k - 1) = k(k + 1) + k - 1$. Indeed
since $i \ge 2$ for $k \ge 2$, the lemma shows $f^{i-1}(k) < k^2 + k - 1$, whence
$f^i(k) \le k(k + 1) + k - 2$ and $0 \le j \le k - 2$. Now, letting $m = k(k + 1) + j + 1$, we have
$f^{i+1}(k) = k(k + 1) + j + k + 1 \equiv k \pmod{m}$, so $f^{i+2}(k) \equiv f(k) \pmod{m}$. Therefore,
the k-sequence mod $(k(k + 1) + j + 1)$ whose initial term is k has period i + 1.
With $k \ge 2$, we have $i \ge 2$ and $0 \le j \le k - 2$ so


    $m(k) \le k(k + 1) + j + 1 < k(k + 2)$. ∎


We have not been able to make progress on the following problem. Let M(k) 
denote the largest integer such that all k-sequences mod M (k) have ultimate period 
one. Does M(k) exist for any k >1; and if so, how large is it? Preliminary calculations 
show that if M(2) exists, then $M(2) \ge 197$.


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Manuscripts for this Department should be sent to Robert Gilmer, Department of Mathematics,
Florida State University, Tallahassee, FL 32306; notes are usually limited to three printed
pages.


ON INJECTIVE MODULES 


AZMI HANNA, American University of Beirut 


Let R be a ring, not necessarily with an identity element. Among the several 
characterizations of injective R-modules, one finds the following: 


An R-module Q is injective if and only if every monomorphism with domain 
Q has a left inverse. 


The necessity of the condition is immediate, while the proof of the sufficiency 
is usually given after showing that every R-module can be embedded in an injective 
module. We give an elementary proof of the sufficiency of the condition which avoids 
use of this embedding. An analogous proof holds for the dual theorem that an R- 
module P is projective if and only if every epimorphism with codomain P has a 
right inverse.

Let Q be an R-module such that every monomorphism with domain Q has a left 
inverse. Let $u: E \to F$ be an R-module monomorphism, $f: E \to Q$ a homomorphism.
We shall show the existence of a homomorphism $g: F \to Q$ such that $gu = f$.
To do so, form the pushout diagram


              u
        E ---------> F
        |            |
      f |            | f'
        v            v
        Q ---------> P
              u'


of the homomorphisms u and f. Here $P = (F \oplus Q)/N$, where N is the submodule of $F \oplus Q$
consisting of all pairs $(u(a), -f(a))$ with $a \in E$, and $f'(x) = N + (x, 0)$, $u'(y) = N + (0, y)$
for every $x \in F$ and $y \in Q$. The above diagram is commutative: $f'u = u'f$. Further, u'
is a monomorphism, for if $u'(y) = N + (0, y) = 0$, then $(0, y) = (u(a), -f(a))$ for some
$a \in E$. Since u is a monomorphism, we must have $a = 0$ and consequently $y = 0$. By
assumption, there exists a homomorphism $v: P \to Q$ such that $vu' = 1_Q$. Let $g = vf'$:
$F \to Q$. Then $gu = vf'u = vu'f = f$. Thus Q is injective.

The above proof and its dual can be generalized to any abelian category which 
may fail to have enough injectives or enough projectives. For instance, it is shown 
in [1, p.260] and in [2, p.257] that the category of sheaves over a topological space 
has enough injectives but not necessarily enough projectives. 


References 


1. R. Godement, Topologie algébrique et théorie des faisceaux, Hermann, Paris, 1958. 
2. B. Mitchell, Theory of Categories, Academic Press, New York, 1965. 
THE HAMEL DIMENSION OF ANY INFINITE 
DIMENSIONAL SEPARABLE BANACH SPACE IS c. 


H. ELTON LACEY, University of Texas at Austin 


In this MONTHLY [1] an elementary proof of the fact that an infinite dimensional
Banach space cannot have dimension $\aleph_0$ is presented. A simple argument can also
establish (without the continuum hypothesis) the statement in the title of this note.

Let X be an infinite dimensional Banach space. It suffices to demonstrate a
linearly independent set in X of cardinality c. By the Hahn-Banach theorem and the
fact that X is infinite dimensional, there are sequences $\{x_n\}$ in X and $\{x_n^*\}$ in $X^*$ such
that $x_n^*(x_n) \ne 0$ and $x_n^*(x_m) = 0$ for $n \ne m$. In particular, $\{x_n\}$ is linearly independent
and for each n, $x_n$ is not in the closed linear span of $\{x_m : m \ne n\}$. Now let $\{N_t\}_{0<t<1}$
be a family of subsets of the positive integers such that $N_t \cap N_{t'}$ is finite for $t \ne t'$ and
each $N_t$ is infinite (see [2] for an easy proof of this). It is easy to see that the family
$\{x_t\}_{0<t<1}$, where $x_t = \sum_{n \in N_t} x_n/2^n$, is a linearly independent family in X.

If, in addition, X is separable, it follows that the Hamel dimension of X is c since 
the cardinality of X is c. 

For a recent result on dimension, see [3] where the authors show that the only 
barreled spaces of countable Hamel dimension are those isomorphic to the space of 
finitely non-zero sequences in its strongest locally convex topology. 


References 


1. W. R. Bauer, and R. H. Benner, The non-existence of a Banach space of countably infinite 
Hamel dimension, this MONTHLY, 78 (1971) 895-896. 

2. J. R. Buddenhagen, Subsets of a countable set, this MONTHLY, 78 (1971) 536-537. 

3. S. Saxon, and M. Levin, Codimensional subspaces of a barrelled space, Proc. Amer. Math.
Soc., 29 (1971) 91-96.




A NOTE ON CONFORMALITY 
R. K. WILLIAMS, Southern Methodist University 


In the following note, it is shown how a restricted orthogonality requirement 
implies analyticity. As a special case, we get the well-known result that conformality 
implies analyticity. 

Suppose that $f(z) = u + iv$ is a transformation of a domain D in the z-plane
into the w-plane. Let u and v be continuous with continuous partial derivatives in
D, and let the Jacobian of the transformation


    $J = \dfrac{\partial(u, v)}{\partial(x, y)} = u_x v_y - u_y v_x$


be nonzero in D. (Hence f(z) is locally one-to-one in D.)

For $z_0 \in D$, let $L_1(z_0)$, $L_2(z_0)$, $L_3(z_0)$, and $L_4(z_0)$ be line segments in D, through
$z_0$, and having angles of inclination 0, $\pi/2$, $\pi/4$, and $3\pi/4$ respectively. Suppose
that for each $z_0 \in D$, $f(L_1(z_0))$ and $f(L_3(z_0))$ are orthogonal to $f(L_2(z_0))$ and
$f(L_4(z_0))$ at $f(z_0)$, respectively. We then have the following:


THEOREM. Either $f(z)$ or $\overline{f(z)}$ is analytic in D.


Proof. Let $z_0 = x_0 + iy_0 \in D$. Then parametric representations for $L_1(z_0)$,
$L_2(z_0)$, $L_3(z_0)$, and $L_4(z_0)$ are


    $x = x_0 + t,\ y = y_0$;    $x = x_0,\ y = y_0 + t$;    $x = x_0 + t,\ y = y_0 + t$;    $x = x_0 + t,\ y = y_0 - t$.


Also, parametric representations for $f(L_1(z_0))$, $f(L_2(z_0))$, $f(L_3(z_0))$ and $f(L_4(z_0))$
are

    $u = u(x_0 + t, y_0)$,  $v = v(x_0 + t, y_0)$,  etc.


Letting $T_1$, $T_2$, $T_3$, and $T_4$ be tangent vectors to $f(L_1(z_0))$, $f(L_2(z_0))$, $f(L_3(z_0))$,
and $f(L_4(z_0))$ at $f(z_0)$, and using complex notation, we see that


    $T_1 = u_x + iv_x$,  $T_2 = u_y + iv_y$,  $T_3 = (u_x + u_y) + i(v_x + v_y)$,  and
    $T_4 = (u_x - u_y) + i(v_x - v_y)$,


where the partial derivatives are evaluated at $z_0$. Since by assumption, $T_1$ and $T_3$
are orthogonal to $T_2$ and $T_4$ respectively, we have


(1)    $u_x u_y + v_x v_y = 0$,


(2)    $u_x^2 - u_y^2 + v_x^2 - v_y^2 = 0$.


From (1) we see that there is a number $c = c(z_0)$ such that

(3)    $u_y = -c v_x$ and $v_y = c u_x$.

Using (2) and (3), we have

    $u_x^2 + v_x^2 = u_y^2 + v_y^2 = c^2 (v_x^2 + u_x^2)$.


Since $J \ne 0$, $u_x^2 + v_x^2 \ne 0$, so $c(z_0) = \pm 1$ for each $z_0 \in D$. Equations (3) imply
that c(z) is continuous at each $z_0 \in D$. Thus $c(z) \equiv 1$ or $c(z) \equiv -1$.

If $c(z) \equiv 1$, equations (3) imply that f(z) is analytic in D. If $c(z) \equiv -1$, the
analyticity of $\overline{f(z)}$ follows similarly.
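
A numerical sketch of the quantity c(z) from (3) may help. Substituting (3) into the Jacobian gives $J = u_x v_y - u_y v_x = c\,(u_x^2 + v_x^2)$, so c can be estimated as $J/(u_x^2 + v_x^2)$; that rearrangement, the central-difference step size, and the two test maps below are assumptions made here for illustration.

```python
def c_value(f, x, y, h=1e-5):
    """Estimate c = J / (u_x^2 + v_x^2) at (x, y) by central differences.

    f maps (x, y) to the pair (u, v).
    """
    u_right, v_right = f(x + h, y)
    u_left, v_left = f(x - h, y)
    u_up, v_up = f(x, y + h)
    u_down, v_down = f(x, y - h)
    ux = (u_right - u_left) / (2 * h)
    vx = (v_right - v_left) / (2 * h)
    uy = (u_up - u_down) / (2 * h)
    vy = (v_up - v_down) / (2 * h)
    jacobian = ux * vy - uy * vx
    return jacobian / (ux ** 2 + vx ** 2)


def square(x, y):
    """f(z) = z^2, an analytic map."""
    return (x * x - y * y, 2 * x * y)


def conjugate_square(x, y):
    """f(z) = conjugate of z^2, whose conjugate is analytic."""
    return (x * x - y * y, -2 * x * y)


print(c_value(square, 0.7, -0.3))            # close to 1
print(c_value(conjugate_square, 0.7, -0.3))  # close to -1
```

Both maps satisfy the orthogonality hypothesis away from the origin, and the sign of c distinguishes the angle preserving case from the angle reversing one.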


COROLLARY. Under the hypotheses of the theorem, f(z) is either angle preserving
or angle reversing in D.


Proof. From the theorem and the fact that $J \ne 0$ in D, either $f(z)$ or $\overline{f(z)}$ is
analytic with a nonvanishing derivative in D. Thus either $f(z)$ or $\overline{f(z)}$ is angle
preserving in D. The corollary follows.

It is clear that conformality implies our (seemingly weaker) orthogonality con- 
dition. The well-known fact that conformality implies analyticity now follows from 
the proof of the corollary. 


A WRONSKIAN CONDITION RELATED TO ORDINARY DIFFERENTIAL EQUATIONS 
L. C. EGGAN and A. J. INSEL, Illinois State University 


It is well known that in general the vanishing of the Wronskian of n + 1 functions 
on the real line R is not sufficient to establish their linear dependence on R. In a
differential equations course recently one of us asked the question: 

Under what conditions does the Wronskian of a finite sequence of functions 
being zero imply that the functions are linearly dependent? 

Naturally the expected answer was: 

It is sufficient that the functions all be solutions to a differential equation of
the form


    $a_m y^{(m)} + a_{m-1} y^{(m-1)} + \cdots + a_1 y' + a_0 y = 0,$


where $a_m, a_{m-1}, \ldots, a_1, a_0$ are continuous on an interval over which $a_m$ is never
zero.

One student suggested (actually stated) that it is sufficient for the sequence
to consist of a function and its derivatives.

In this note we prove this assertion and as a corollary obtain a sufficient condition 
for a function to be analytic. Our proof is a refinement of the following theorem 
to be found in [1, p. 47] which is stated here for completeness. 




THEOREM 8. Let $\phi_1(t), \phi_2(t), \ldots, \phi_n(t)$ be any n functions continuous together
with their first (n-1) derivatives over some interval $t_1 < t < t_2$.
(a) If the $\phi$'s are linearly dependent over $t_1 < t < t_2$, then their Wronskian


    $W(\phi_1, \phi_2, \ldots, \phi_n) \equiv 0, \quad t_1 < t < t_2.$


(b) Suppose

(1) $W(\phi_1, \phi_2, \ldots, \phi_n) \equiv 0$, $t_1 < t < t_2$; but

(2) for some (n-1) of the $\phi$'s (say, without loss of generality, all but $\phi_n$)
$W(\phi_1, \phi_2, \ldots, \phi_{n-1}) \ne 0$, all t, $t_1 < t < t_2$,
then $\phi_1, \phi_2, \ldots, \phi_n$ are linearly dependent over $t_1 < t < t_2$.


With the aid of the above theorem and a modest background in real variable 
theory our theorem can be presented in a first year differential equations course. 
The converse of our theorem is well known (cf. [2], pp. 99-100). 


THEOREM. If f is a real valued function defined on the real line R and if there
exists a positive integer n such that f is 2n-fold differentiable on R and
$W(f, f', \ldots, f^{(n)}) \equiv 0$ on R, then f is the solution of an n-th order homogeneous
linear differential equation with constant coefficients not all of which are zero.
In particular, f is analytic.


Proof. We first suppose n is minimal in this regard, so

(1)    $W(f, f', \ldots, f^{(n-1)}) \ne 0$


at some point of R. Hence by the continuity of the first 2(n-1) derivatives of f,
(1) holds on some connected set I in R. Choose I to be a maximal connected set on
which (1) holds, and note, again by continuity, that I is open.

Now apply Theorem 8(b) as stated above to conclude that $f, f', \ldots, f^{(n)}$ are
linearly dependent on I. (Note that, in the context of this paper, it is required in
the statement of Theorem 8 that $f^{(2n)}$ be continuous. However, the proof makes
no use of this hypothesis and hence the continuity of $f^{(2n)}$ may be ignored.) Continuing
with the argument, by linear dependence there exist real numbers $a_0, a_1, \ldots, a_n$
not all zero, such that $\sum_{j=0}^{n} a_j f^{(j)} \equiv 0$ on I. Let g be the solution to the differential
equation


(2)    $\sum_{j=0}^{n} a_j y^{(j)} = 0$

which is defined on all of R and is such that $f = g$ on I. Then $f^{(j)} = g^{(j)}$ on I
and hence, by continuity, also on the closure of I, which we denote by $\bar{I}$, for
$0 \le j \le 2(n-1)$. Thus we must have


(3)    $W(f, f', \ldots, f^{(n-1)}) = W(g, g', \ldots, g^{(n-1)})$ on $\bar{I}$.


Now since $g, g', \ldots, g^{(n-1)}$ are solutions to (2) whose Wronskian does not
vanish on I, it is well known (cf. [1], Theorem 7, p. 47) that $W(g, g', \ldots, g^{(n-1)})$ does
not vanish anywhere on R. Thus by (3) inequality (1) holds on $\bar{I}$. But I was maximal,
so the open set I must equal $\bar{I}$. This can happen only if $I = R$, and therefore $f = g$
on R as desired.


References 


1. W. Hurewicz, Lectures on Ordinary Differential Equations, Technology Press of M. I. T. and
Wiley, New York, 1958.
2. E. Rainville, Elementary Differential Equations, 3rd edition, Macmillan, New York, 1964.


MATHEMATICAL EDUCATION 
EDITED BY J. G. HARVEY AND M. W. POWNALL 


Material for this Department should be sent to either of the editors: J. G. Harvey, Department
of Mathematics, University of Wisconsin, WI 53706; M. W. Pownall, Department of Mathematics,
Colgate University, Hamilton, NY 13346.


TEACHING APPLICABLE MATHEMATICS 


E. A. BENDER, Institute for Defense Analyses 


Introduction. In this article I shall examine what appear to be the purposes of the
usual two year calculus service course (I include differential equations in "calculus"),
what I think they should be, and how these latter could be achieved. What I have to say
is not intended for courses for undergraduates planning to enter pure mathematics,
but those students might also profit from the proposed approach.

I want to suggest a philosophy, an approach. As a result the details of carrying
out the ideas are not described. Filling in these details amounts to writing a course,
which is by no means trivial. Although it would be difficult, I believe it is not impossible.


Purposes of the course. There seem to be three traditional goals for the calculus 
sequence: 

(i) Learn some useful mathematics.

(ii) Develop a feel for one or more areas of mathematics. (Analytic geometry,
linear algebra or elementary probability are sometimes included.)

(iii) Obtain an idea of what mathematicians do and of the beauty of mathematics.

We regard (i) as the analytic version and (ii) as the synthetic version of the same
idea. The last goal might be labeled "culture" and is frequently absent. To some
extent (ii) and (iii) are reasonable outgrowths of (i).


PROBLEMS AND SOLUTIONS 
EDITED BY EMORY P. STARKE


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN,
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, and UNIVERSITY OF MAINE PROBLEMS GROUP: EARL M. L.
BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, WILLIAM R.
GEIGER, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS S. MORSE,
GRATTAN P. MURPHY, EDWARD S. NORTHAM and WILLIAM L. SOULE, JR.


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of Elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
June 30, 1973. Contributors (in the United States) who desire acknowledgment of receipt 
of their solutions are asked to enclose self-addressed stamped postcards. 


An asterisk (*) means neither the proposer nor the editors supplied a solution.


E 2403. Proposed by W. A. Al-Salam, University of Alberta, and A. M. Chak, 
West Virginia University 


It is known that a generalization of the binomial theorem is Euler's identity


(1)    $(1 + x)(1 + qx) \cdots (1 + q^{n-1}x) = \sum_{k=0}^{n} \dfrac{F(n, q)\, q^{k(k-1)/2}}{F(n-k, q)\, F(k, q)}\, x^k,$

where q is a fixed complex number which is not a primitive kth root of unity for any
$k \ge 2$, where


    $F(k, q) = \prod_{j=1}^{k} (1 + q + \cdots + q^{j-1}),$


and where $F(0, q) = 1$. (Thus $F(k, 1) = k!$.)
Show that the only solution of


(2)    $(1 + x)(1 + c_1 x) \cdots (1 + c_{n-1} x) = \sum_{k=0}^{n} \dfrac{\alpha_n \gamma_k}{\alpha_{n-k}\, \beta_k}\, x^k$


is this identity of Euler. That is, if given a sequence $c_1, c_2, \ldots$ of complex numbers
there exist sequences $\alpha_0, \alpha_1, \ldots$; $\beta_0, \beta_1, \ldots$; $\gamma_0, \gamma_1, \ldots$ such that (2) holds identically
for all values of n, then necessarily $c_1 = c_n/c_{n-1} = q$ is constant for $n = 2, 3, \ldots$ and
thus (2) must have the form of (1).


E 2404*. Proposed by Russell Maurer, Harvard Medical School 


At Smith College, the graduation exercises traditionally proceed as follows: 
Although each diploma is made out to a particular girl, all the diplomas are initially 
given out at random. All of the girls who do not get their own diplomas then form a 
circle, and each passes the diploma she has to the girl on her right. Those who now 
have their own diplomas drop out, and the remaining girls again pass their diplomas 
to the right, and so on. This procedure is repeated until each girl has her own diploma. 
If there are n girls in the graduating class, what is the probability that it takes precisely
k passes before each girl has her own diploma?


E 2405. Proposed by R. E. Shafer, Lawrence Radiation Laboratory 


Forman S. Acton in his book Numerical Methods that (almost) Work (Harper- 
Row, New York, 1970, pp. 29-40), proposed several numerical methods for the 
evaluation of 


20 1 
Derive the following additional form: 


1 a b ea) pb 2nti 
F(b) = — $ log blogs + (2n + 1)?’ 0<b<1. 
E 2406. Proposed by Erwin Just and Norman Schaumberger, Bronx Com- 
munity College 


What is the maximum value of $\alpha$ and the minimum value of $\beta$ for which


    $\left(1 + \dfrac{1}{n}\right)^{n+\alpha} \le e \le \left(1 + \dfrac{1}{n}\right)^{n+\beta}$


for all positive integers n?
E 2407. Proposed by A. W. Walker, Toronto, Canada 


Given the circumcenter O, orthocenter H, and incenter I of an unknown triangle
T, (A) locate by Euclidean construction the Gergonne point and the Lemoine point
of T (i.e., the centers of perspective of T with the triangles formed respectively by the
contact points of the sides of T with its incircle and by the tangent lines at the vertices
of T to its circumcircle). (B) Locate the orthocenters of the pedal triangles of H
and I.




SOLUTIONS OF ELEMENTARY PROBLEMS 
A Set of Unrelated Primes 


E 2293 [1971, 405; 1972, 302]. Proposed by Erwin Just, Bronx Community 
College 


Does there exist an infinite set of primes, S, such that whenever peS and qgeS 
we have (4(p — 1), 4(q — 1)) = 1, (Bg — 1) = 1 and (p—1, q) =1? 


Solution by Frederick Carty, Parsippany, New Jersey. A set with the desired 
properties does exist. We shall demonstrate this by constructing $S = \{p_1, p_2, \ldots\}$ 
inductively. Let $p_1 = 3$ and $p_2 = 5$, and suppose that $p_1, p_2, \ldots, p_n$ have been chosen. 
Let $d_n = \prod_{j=1}^{n} \tfrac12 p_j(p_j - 1)$; by Dirichlet's theorem, there exists a prime $p_{n+1}$ of the 
form $2kd_n - 1$. Then for $i \le n$, obviously $(p_i, p_{n+1} - 1) = (p_i - 1, p_{n+1}) = (\tfrac12(p_i - 1), 
\tfrac12(p_{n+1} - 1)) = 1$ and we are done. 
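
The inductive construction can be tried out numerically. The following is only a sketch: the helper names (`is_prime`, `unrelated`, `build_primes`, `all_pairs_unrelated`) are illustrative and not part of the solution, and taking the least admissible k at each step is an assumption made for definiteness (the solution only needs some k supplied by Dirichlet's theorem).

```python
from math import gcd


def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True


def unrelated(p, q):
    """The three gcd conditions of the problem for a pair of distinct odd primes."""
    return (gcd((p - 1) // 2, (q - 1) // 2) == 1
            and gcd(p, q - 1) == 1
            and gcd(p - 1, q) == 1)


def build_primes(count):
    """Inductive step: p_{n+1} = 2*k*d_n - 1, using the least k that gives a prime."""
    primes = [3, 5]
    while len(primes) < count:
        d = 1
        for p in primes:
            d *= p * (p - 1) // 2      # d_n = product of (1/2) p_j (p_j - 1)
        k = 1
        while not is_prime(2 * k * d - 1):
            k += 1
        primes.append(2 * k * d - 1)
    return primes[:count]


def all_pairs_unrelated(primes):
    """Check the conditions for every pair of distinct members."""
    return all(unrelated(p, q)
               for i, p in enumerate(primes) for q in primes[i + 1:])
```

With this choice of k the first three members are 3, 5, 59. The values grow very quickly, so trial division is practical only for the first few terms.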


Also solved by Anders Bager (Denmark), Problem Solving Group Berne (Switzerland), R. T. 
Bumby, Frederick Carty, M. S. Demos, R. J. Dickson, Harold Donnelly, Neal Felsinger, Heiko 
Harborth (Germany), C. V. Heuer & G. A. Heuer, Emmett Keeler & Joel Spencer, Harry Lass, 
L. E. Mattics, Kenneth Schilling, Paul Smith, Karl Stoop (Colombia), Charles Wexler, Gregory 
Wulczyn, and the proposer. Partial solutions by John Coolidge, and by Bernardo Recaman (Colom- 
bia). 


Rearrangement of Series 


E 2343 [1972, 303]. Proposed by G. A. Heuer, Concordia College 


According to a well-known theorem of analysis, a series of real numbers is 
unconditionally convergent (i.e., $\sum a_{\varphi(n)} = \sum a_n$ for every permutation $\varphi$ of the 
positive integers) if and only if it is absolutely convergent. Certain kinds of rearrangements, 
however, will leave the sum of an arbitrary convergent series unaltered. 
(A) Prove that if $\varphi(n) - n$ is bounded, and $\sum a_n$ is any convergent series, then 
$\sum a_{\varphi(n)} = \sum a_n$. (B) Prove or disprove: If $\varphi(n) - n$ is unbounded, then there is 
a series $\sum a_n$ for which $\sum a_{\varphi(n)} \ne \sum a_n$. 


Solution by David Monk, Mathematical Institute, Edinburgh, Scotland. To 
show part (A), let $s_m$ and $t_m$ be the mth partial sums of $\sum a_n$ and $\sum a_{\varphi(n)}$ respectively. 
Suppose that $n - k \le \varphi(n) \le n + k$ for all n. If $m > k$ and $0 < r \le m - k$, then 
$\varphi^{-1}(r) - k \le \varphi(\varphi^{-1}(r)) = r \le m - k$ so that $\varphi^{-1}(r) \le m$. Thus 

$$\{1, \ldots, m-k\} \subset \varphi(\{1, \ldots, m\})$$

so that $t_m$ includes all terms $a_n$ for which $n \le m - k$. On the other hand, $\varphi(n) \le m + k$ 
for $n = 1, \ldots, m$, so $t_m$ excludes all terms $a_n$ for which $n > m + k$. Therefore whenever 
$m > k$, 

$$|t_m - s_m| \le |a_{m-k+1}| + \cdots + |a_{m+k}|$$


318 ELEMENTARY PROBLEMS AND SOLUTIONS [March 


which approaches 0 as $m \to \infty$ since $a_n \to 0$ and k is fixed. 

As for part (B), the assertion is false. Let $\varphi$ be the permutation which interchanges 
1 and 2, 3 and 5, 6 and 9, ..., $\tfrac12 n(n+1)$ and $\tfrac12 n(n+3)$, ... and leaves all 
other integers unchanged. Since $\tfrac12 n(n+3) - \tfrac12 n(n+1) = n$, it follows that $\varphi(n) - n$ 
is unbounded. Suppose that $\sum a_n$ is any convergent series with sum s. We have 
(in the notation of part (A)) two possibilities for $t_m - s_m$. If $m = \tfrac12 n(n+3)$ for some 
natural number n, then $t_m - s_m = 0$; otherwise $\tfrac12 n(n+1) \le m < \tfrac12 n(n+3)$ for some 
(unique) natural number n and in this case 

$$t_m - s_m = a_{n(n+3)/2} - a_{n(n+1)/2}.$$

Since $a_n \to 0$, we conclude that since $s_m \to s$ likewise $t_m \to s$. 
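
The counterexample for part (B) can be checked on a concrete series. This is a sketch only: the function names are illustrative, and the alternating harmonic series is an arbitrary choice of convergent series, not one used in the solution. Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction


def phi(n):
    """Swap j(j+1)/2 with j(j+3)/2 for each j >= 1 and fix every other integer."""
    j = 1
    while j * (j + 3) // 2 < n:
        j += 1
    lo, hi = j * (j + 1) // 2, j * (j + 3) // 2
    if n == lo:
        return hi
    if n == hi:
        return lo
    return n


def a(n):
    """Terms of the alternating harmonic series (an illustrative choice)."""
    return Fraction((-1) ** (n + 1), n)


def partial_sum_gap(m):
    """t_m - s_m, computed exactly."""
    s = sum(a(n) for n in range(1, m + 1))
    t = sum(a(phi(n)) for n in range(1, m + 1))
    return t - s
```

For instance `partial_sum_gap(7)` equals `a(9) - a(6)`, matching the case $\tfrac12 n(n+1) \le m < \tfrac12 n(n+3)$ with n = 3, while `partial_sum_gap(9)` is zero.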


Also solved by Anders Bager (Denmark), Roby Ballard & George Zahn, J.A. Belward (Australia), 
M. T. Bird, D. M. Bloom, D. L. Costa, R. J. Dickson, David Farnsworth, Iowa Problem Group, 
J. R. Kuttler, Joel Levy, H. Sarbadhikari (India), Kenneth Schilling, Gary Thomas, and the proposer. 


Editor’s comment. Monk also refers to R. P. Agnew, Permutations preserving convergence of 
series, Proc. Amer. Math. Soc. 6 (1955), 563-564, where it is shown that a necessary and sufficient 
condition that $\sum a_{\varphi(n)}$ converge whenever $\sum a_n$ does (and to the same sum) is the existence of an 
integer N such that for each n the set $\{\varphi(r): 1 \le r \le n\}$ is expressible as a union of not more than 
N blocks of consecutive integers. In the example used in (B) above, N = 3. 


Jumping Around in an Ellipse 
E2345 [1972, 303]. Proposed by E. S. Langford, University of Maine 


Let S be a nonempty compact subset of the plane. A sequence $\{P_n\}$ of points 
of S has the following property: 

$$d(P_n, P_{n+1}) = \max\{d(P_n, P): P \in S\}.$$

Let $d_n = d(P_n, P_{n+1})$. Then obviously $d_1 \le d_2 \le \cdots \le \delta$, where $\delta$ is the diameter 
of S. Let $d = \lim d_n$. (a) Is it possible that $d < \delta$? (b) Is it possible that the sequence 
$\{d_n\}$ is strictly increasing? (c) Is it possible that $\{d_n\}$ is strictly increasing and, in 
addition, that $d < \delta$? 


Solution by H. S. Witsenhausen, Bell Telephone Laboratories. The answer 
to all three questions is yes. In fact, we can show that examples of (a) exist if and 
only if $\delta \le \sqrt{3}\,d$, and that an example for (c) exists for every d such that $\sqrt{3}\,d > \delta$. 

Let $\{P_n\}$ be any sequence of the type described in the problem. For $\varepsilon > 0$, choose 
n so that $d - \varepsilon < d_n$. If $Q \in S$, then $d(Q, P_n) \le d_n \le d$ and $d(Q, P_{n+1}) \le d_{n+1} \le d$, 
so that S is contained in the intersection of the two circular disks of radius d centered 
at $P_n$ and $P_{n+1}$. The diameter of this intersection does not exceed $\sqrt{3}(d + \varepsilon)$, from 
which it follows that $\delta \le \sqrt{3}\,d$, i.e., that $d \ge \sqrt{3}\,\delta/3$. We can achieve equality by 
taking S to be a rhombus which is the union of two equilateral triangles, and then 
taking for $P_n$ alternately the ends of the shorter diagonal. 
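
The equality case can be confirmed directly. The sketch below assumes a rhombus built from two equilateral triangles of side 1 placed with the shorter diagonal on the x-axis; the coordinates and names are illustrative. Because distance to a fixed point is a convex function, its maximum over the rhombus is attained at a vertex, so only vertices need to be examined.

```python
import math
from itertools import combinations


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Two unit equilateral triangles sharing the shorter diagonal AB.
A = (-0.5, 0.0)
B = (0.5, 0.0)
C = (0.0, math.sqrt(3) / 2)
D = (0.0, -math.sqrt(3) / 2)
VERTICES = [A, B, C, D]


def farthest_distance(p):
    """Largest distance from p to a point of the rhombus (attained at a vertex)."""
    return max(dist(p, v) for v in VERTICES)


diameter = max(dist(p, q) for p, q in combinations(VERTICES, 2))
d_from_A = farthest_distance(A)   # B is an admissible next point
d_from_B = farthest_distance(B)   # A is an admissible next point
```

Alternating between A and B therefore gives a constant sequence $d_n = 1$, while the diameter (the longer diagonal CD) is $\sqrt{3}$.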




One can construct an example with $\{d_n\}$ strictly increasing and $\delta = (\sqrt{3} - \varepsilon)d$ 
for arbitrary $\varepsilon > 0$ as follows: Let S be the union of an eccentric ellipse (with unit 
major axis) together with two points symmetrically located on the extensions of the 
minor axis which are a distance $\sqrt{3} - \varepsilon$ apart. If $P_1$ is chosen on the ellipse, sufficiently 
close to one end of the major axis, the sequence $\{P_n\}$ is uniquely determined, 
the $d_n$ are strictly increasing, and d = 1. [This is probably most easily seen by 
noting that given $P_n$, the point $P_{n+1}$ must lie somewhere strictly between the other 
end of the major axis and the point antipodal to $P_n$. This can be shown by standard 
calculus techniques. - Ed.] 


Also solved by J. C. Binz and the Problem Solving Group Berne (Switzerland), J. W. Boyd, R. 
J. Dickson, D. P. Giesy, G. A. Heuer, Ralph Jones, E. Keeler & J. Spencer, C. J. Knight, O. P. 
Lossers (Netherlands), Bill Margolis, L. E. Mattics, G. L. Mille#, Bill Sands, and David Weinberger. 


Editor’s Comment: The ellipse plus two points example was used by several solvers. Others used 
two non-overlapping circles together with two isolated points; and still others used appropriate 
unions of circular arcs. Dickson and Weinberger (independently) observe that we can take the convex 
hull in any of the examples above, and still have an example. Jones notes that the only way we can 
have $\delta = d\sqrt{3}$ is for $\{P_n\}$ to alternate (after the first term) between two points so that $d_n$ is constant 
for $n \ge 2$. 


Groups with Small Centralizers 


E 2346 [1972, 303]. Proposed by Louis Shapiro, Howard University 


Say that a group has small centralizers if every non-identity element commutes 
only with its inverse, itself, and the identity. Characterize all groups with small 
centralizers. 


I. Solution by the Problem Solving Group Berne, Switzerland. We shall show 
that the only groups with small centralizers are the cyclic groups of orders one, two, 
and three and the symmetric group $S_3$. If $o(G) \le 3$, then certainly G has small 
centralizers, so assume that G has small centralizers and that $o(G) \ge 4$. Since 
$aa^2 = a^2a$ by the associative law, every $g \in G$ other than the identity e has order 
2 or 3. But not every element in G other than e can have order 2, for if a and b are 
distinct elements of order 2 and ab is also, then $ab = (ab)^{-1} = b^{-1}a^{-1} = ba$, contrary 
to assumption. Hence G must contain at least one element b of order 3; this 
implies the existence of at least two elements of order 3 since $b^2 \ne b$ is also of order 3. 

We show now that G must have exactly two elements of order 3. Suppose to 
the contrary that $c \in G$ is an element of order 3 distinct from b and $b^2$. Then there 
are at least four elements of order 3: $b, b^2, c, c^2$. Consider the element bc. Either 
o(bc) = 2 or o(bc) = 3. Suppose that o(bc) = 2. Now $cb \ne bc$ by assumption and it 
is easy to show that o(cb) = 2. But (bc)(cb) is not the identity and 
$(bccb)^2 = bc^2b^2c^2b = bc^2b^{-1}c^{-1}b = bc^2(cb)^{-1}b = bc^2(cb)b = e$. This implies that 
o(bccb) = 2 and therefore that bc and cb commute as above. This is a contradiction. 




Suppose that o(bc) = 3. Consider the elements $bc^2$ and $b^2c$. They are distinct, 
neither is the identity, and they are not inverses of each other since 
$(bc^2)^{-1} = c^{-2}b^{-1} = cb^2$ and c and $b^2$ do not commute by assumption. But 
$(bc)^2 = (bc)^{-1} = c^{-1}b^{-1} = c^2b^2$ so that $(bc^2)(b^2c) = b(c^2b^2)c = b(bc)^2c 
= (b^2c)(bc^2)$, contrary to assumption. We have now shown that G has precisely 
two elements of order 3: b and $b^2$. 

Since $o(G) \ge 4$, it follows that G has an element g of order 2. Then bg and $b^2g$ 
are two further elements of G which must be of order 2 and $K = \{e, b, b^2, g, bg, b^2g\}$ 
is a subgroup of G isomorphic to the symmetric group $S_3$. This must exhaust G, 
for suppose there exists an element h not in K. Necessarily o(h) = 2 and since 
$gh \notin K$ it follows that o(gh) = 2 and therefore that gh = hg as before. Since $S_3$ 
obviously has small centralizers, we see that G has small centralizers if and only if 
it has order not exceeding 3 or is isomorphic to $S_3$, that is, if and only if G is 
isomorphic to a subgroup of $S_3$. 
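
The conclusion is easy to test by brute force on small permutation groups. The following sketch is illustrative only: groups are represented as lists of permutation tuples, and all function names are assumptions introduced here.

```python
from itertools import permutations


def compose(p, q):
    """Product of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def has_small_centralizers(group):
    """Every non-identity g commutes only with the identity, g itself, and g's inverse."""
    identity = tuple(range(len(group[0])))
    for g in group:
        if g == identity:
            continue
        allowed = {identity, g, inverse(g)}
        for h in group:
            if compose(g, h) == compose(h, g) and h not in allowed:
                return False
    return True


def symmetric_group(n):
    return list(permutations(range(n)))


def cyclic_group(n):
    """Z_n realized as the rotations of n points."""
    return [tuple((i + k) % n for i in range(n)) for k in range(n)]
```

Under these definitions the cyclic groups of orders 1, 2, 3 and the group $S_3$ pass the check, while for example the cyclic groups of orders 4 and 6 and the group $S_4$ do not.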


II. Solution (G finite) by D. M. Bloom, Brooklyn College. If o(G) = n and there 
are a and b conjugacy classes of elements of orders 2 and 3 respectively, then since 
each element centralizes only its own powers, we have for the class equation of G 

$$n = 1 + \frac{n}{2}\,a + \frac{n}{3}\,b,$$

that is, $n(6 - 3a - 2b) = 6$ so that n is a divisor of 6. If n = 1, 2, or 3 we have 
the cyclic groups which do have small centralizers, whereas if n = 6, G cannot 
be abelian, and must therefore be $S_3$, which does have small centralizers. 
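
As a small check on the arithmetic, the non-negative integer solutions of $n(6 - 3a - 2b) = 6$ can be listed exhaustively. The function name below is an illustrative assumption.

```python
def class_equation_solutions():
    """All (n, a, b) with n >= 1, a >= 0, b >= 0 and n * (6 - 3a - 2b) = 6."""
    solutions = []
    for a in range(0, 3):          # the factor must be positive, so a <= 1
        for b in range(0, 4):      # likewise b <= 2; both ranges are slightly generous
            factor = 6 - 3 * a - 2 * b
            if factor > 0 and 6 % factor == 0:
                solutions.append((6 // factor, a, b))
    return sorted(solutions)
```

The four solutions have n = 1, 2, 3, 6, corresponding to the trivial group, the cyclic groups of orders 2 and 3, and $S_3$ (one class of elements of order 2 and one of order 3).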


III. Solution (G finite) by H. D’Alarcao and T. Moore, Bridgewater State 
College. If G is finite and non-trivial and has small centralizers, then so does every 
subgroup of G, in particular every Sylow p-subgroup $G_p$ of G. Since $G_p$ has non-trivial 
center, $o(G_p) = p$. But every element of G has order either 2 or 3 so that 
every Sylow p-subgroup of G has order either 2 or 3. Hence the order of G is either 
1, 2, 3, or 6. The cyclic group of order 6 does not have small centralizers, so that 
the only groups with small centralizers are the cyclic groups of order 1, 2, or 3 and 
the symmetric group $S_3$. 


IV. Solution (infinite case) by John Comiskey, Monsignor Farrell High School 
(New York City). Having determined the finite groups with small centralizers, 
suppose that G is an infinite group with small centralizers. Choose seven elements 
of G and consider the subgroup H generated by these elements. A group generated 
by a finite number of elements whose orders do not exceed 3 is finite (see B. H. 
Neumann, Groups whose elements have bounded orders, J. London Math. Soc. 11 
(1937), 195-198) so that H is a finite group with small centralizers whose order 
exceeds 6. This is a contradiction so that there does not exist an infinite group with 
small centralizers. 




Complete solutions submitted by Anders Bager (Denmark), the Bennett College Team, D. M. 
Bloom, D. E. Bridgewater, John Comiskey, D. A. Sibley, Brian Wesselink, and the proposer. Partial 
solutions (assuming G finite) by H. D’Alarcao and T. Moore, Annette Dittmer, M. G. Greening 
(Australia), C. V. Heuer, Barbara Keller, Desmond MacHale (England), David Ritter, Steven Russ, 
Bruce Staal, and S. Srinivasan (India). Partial solution by M. R. Modak (India). 


Editor’s comment. For the case of G infinite, most solvers made reference to the special case 
n = 3 of the Burnside problem to reduce the problem back to the finite case. (See Marshall Hall, 
Theory of Groups, Macmillan, 1959, Section 18.2.) The solution by the Berne Group is interesting be- 
cause it does not use this result. 


Symmedian Point of a Triangle 


E 2347 [1972, 303]. Proposed by Leonard Carlitz, Duke University 


Let P denote a point in the interior of the triangle ABC. Let $\alpha, \beta, \gamma$ denote the 
angles of ABC. Let $R_1, R_2, R_3$ denote the distances from P to the vertices of ABC, 
and let $r_1, r_2, r_3$ denote the distances from P to the sides of ABC. Show that 

$$R_1^2\sin^2\alpha + R_2^2\sin^2\beta + R_3^2\sin^2\gamma \le 3(r_1^2 + r_2^2 + r_3^2)$$

with equality if and only if P is the symmedian point of ABC. 


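Solution by Ralph Garfield, The College of Insurance, New York. Let $\theta_1 = 
\angle PAC$ and $\theta_2 = \angle BAP$. We then see that $r_2 = R_1\sin\theta_1$ and $r_3 = R_1\sin\theta_2$, so that 

$$r_3 = R_1\sin(\alpha - \theta_1) = R_1(\sin\alpha\cos\theta_1 - \cos\alpha\sin\theta_1) = R_1\sin\alpha\cos\theta_1 - r_2\cos\alpha.$$

Therefore 

$$r_2^2\sin^2\alpha + (r_3 + r_2\cos\alpha)^2 = R_1^2\sin^2\theta_1\sin^2\alpha + R_1^2\sin^2\alpha\cos^2\theta_1,$$

which is $r_2^2 + r_3^2 + 2r_2r_3\cos\alpha = R_1^2\sin^2\alpha$. It now follows that 

$$R_1^2\sin^2\alpha + R_2^2\sin^2\beta + R_3^2\sin^2\gamma = 2(r_1^2 + r_2^2 + r_3^2) + 2(r_2r_3\cos\alpha + r_3r_1\cos\beta + r_1r_2\cos\gamma).$$

To complete the problem it suffices to show that 

$$2(r_2r_3\cos\alpha + r_3r_1\cos\beta + r_1r_2\cos\gamma) \le r_1^2 + r_2^2 + r_3^2.$$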
Using Lagrange multipliers, we maximize 

$$L(\alpha, \beta, \gamma, \lambda) = 2(r_2r_3\cos\alpha + r_3r_1\cos\beta + r_1r_2\cos\gamma) + \lambda(\alpha + \beta + \gamma - \pi).$$

We find that 

$$\lambda = 2r_2r_3\sin\alpha = 2r_3r_1\sin\beta = 2r_1r_2\sin\gamma.$$

Since none of $r_1, r_2, r_3$ is equal to zero, we have 

$$r_2\sin\alpha = r_1\sin\beta,$$

$$r_3\sin\alpha = r_1\sin\gamma = r_1\sin(\alpha + \beta) = r_1(\sin\alpha\cos\beta + \cos\alpha\sin\beta) = r_1\sin\alpha\cos\beta + r_2\sin\alpha\cos\alpha.$$

Now assuming $\sin\alpha \ne 0$ we find 

$$(r_3 - r_2\cos\alpha)^2 + (r_2\sin\alpha)^2 = r_1^2\cos^2\beta + r_1^2\sin^2\beta = r_1^2.$$

Transposing terms gives 

$$2r_2r_3\cos\alpha = r_2^2 + r_3^2 - r_1^2.$$

A similar argument yields 

$$2r_3r_1\cos\beta = r_3^2 + r_1^2 - r_2^2, \qquad 2r_1r_2\cos\gamma = r_1^2 + r_2^2 - r_3^2.$$

This extremum is a maximum, so we have shown that 

$$2r_2r_3\cos\alpha + 2r_3r_1\cos\beta + 2r_1r_2\cos\gamma \le r_1^2 + r_2^2 + r_3^2.$$
Now, it is known (see Bottema et al., Geometric Inequalities, Groningen, 1969, 
p. 23, no. 2.20) that if x, y, z are arbitrary positive numbers, then 

$$x\cos\alpha + y\cos\beta + z\cos\gamma \le \frac{yz}{2x} + \frac{zx}{2y} + \frac{xy}{2z}$$

with equality if and only if 

$$ax = by = cz,$$

where a, b, c denote the sides of ABC. Therefore 

$$R_1^2\sin^2\alpha + R_2^2\sin^2\beta + R_3^2\sin^2\gamma \le 3(r_1^2 + r_2^2 + r_3^2)$$

with equality if and only if 

$$(*)\qquad r_1 : r_2 : r_3 = a : b : c.$$

But it is known that (*) holds if and only if P is the symmedian point of ABC. 
(See, e.g., R. A. Johnson, Modern Geometry, Boston, 1929, p. 214.) This completes 
the proof. 
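
A numerical spot check of the inequality and of the equality case may be helpful. This is a sketch under stated assumptions: the function names are illustrative, interior points are generated from positive barycentric weights, and the symmedian point is taken to have barycentric coordinates $a^2 : b^2 : c^2$ (a standard fact that is not part of the solution above).

```python
import math
import random


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dist_to_line(p, q, r):
    """Distance from p to the line through q and r."""
    cross = (r[0] - q[0]) * (p[1] - q[1]) - (r[1] - q[1]) * (p[0] - q[0])
    return abs(cross) / dist(q, r)


def from_barycentric(A, B, C, weights):
    total = sum(weights)
    x = sum(w * v[0] for w, v in zip(weights, (A, B, C))) / total
    y = sum(w * v[1] for w, v in zip(weights, (A, B, C))) / total
    return (x, y)


def both_sides(A, B, C, P):
    """Return (sum of R_i^2 sin^2(angle_i), 3 * sum of r_i^2) for a point P."""
    a, b, c = dist(B, C), dist(C, A), dist(A, B)
    alpha = math.acos((b * b + c * c - a * a) / (2 * b * c))
    beta = math.acos((c * c + a * a - b * b) / (2 * c * a))
    gamma = math.pi - alpha - beta
    R = [dist(P, A), dist(P, B), dist(P, C)]
    r = [dist_to_line(P, B, C), dist_to_line(P, C, A), dist_to_line(P, A, B)]
    lhs = sum(Ri ** 2 * math.sin(t) ** 2 for Ri, t in zip(R, (alpha, beta, gamma)))
    return lhs, 3 * sum(ri ** 2 for ri in r)


def symmedian_point(A, B, C):
    """Assumed barycentric coordinates a^2 : b^2 : c^2."""
    a, b, c = dist(B, C), dist(C, A), dist(A, B)
    return from_barycentric(A, B, C, (a * a, b * b, c * c))
```

Calling `both_sides` at `symmedian_point(A, B, C)` should return two values that agree up to rounding, and at other interior points the first value should not exceed the second.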


Also solved by Leon Bankoff, M. G. Greening (Australia), Hans Kappus (Switzerland), M. S. 
Klamkin, F. Leuenberger (Switzerland), Simeon Reich (Israel), C. S. Venkataraman (India), and 
the proposer. 




Non-Negative Forms 


E 2348 [1972, 304]. Proposed by Leonard Carlitz, Duke University 


Let P be a point in the interior of a triangle ABC. Let $R_1, R_2, R_3$ denote the 
distances from P to the vertices of ABC and let $r_1, r_2, r_3$ denote the perpendicular 
distances from P to the sides of ABC. Show that 

$$(1)\qquad \sum R_1(r_2 + r_3) \ge \sum (r_1 + r_2)(r_1 + r_3);$$

$$(2)\qquad \sum (R_1 + R_2)(R_1 + R_3) \ge 4\sum (r_1 + r_2)(r_1 + r_3),$$

with equality if and only if ABC is equilateral and P is its center. 


Solution by M. S. Klamkin, Ford Scientific Laboratory. To satisfy (1) we prove 
a stronger inequality. For the triangle ABC let a, b, c be the lengths of the sides BC, 
CA, AB, respectively. From [1, p. 107] we have 

$$(3)\qquad R_1 \ge \frac{r_2c + r_3b}{a}, \qquad R_2 \ge \frac{r_1c + r_3a}{b}, \qquad R_3 \ge \frac{r_1b + r_2a}{c},$$

with equality if and only if ABC is equilateral and P is its center. We now prove that 

$$(4)\qquad \sum a^{-1}(r_2c + r_3b)(r_2 + r_3) \ge \sum (r_1 + r_2)(r_1 + r_3).$$

This inequality implies (1). This inequality is actually valid for all real $r_1, r_2, r_3$ since 
it will be shown to be a non-negative quadratic form with equality if and only if 
a = b = c. The matrix associated with (4) is given by 

$$M = \begin{pmatrix}
\dfrac{b^2 + c^2 - bc}{bc} & \dfrac{a + b - 3c}{2c} & \dfrac{a + c - 3b}{2b} \\[2ex]
\dfrac{b + a - 3c}{2c} & \dfrac{c^2 + a^2 - ca}{ca} & \dfrac{b + c - 3a}{2a} \\[2ex]
\dfrac{c + a - 3b}{2b} & \dfrac{c + b - 3a}{2a} & \dfrac{a^2 + b^2 - ab}{ab}
\end{pmatrix}.$$


As is well known, (4) is a non-negative form if the three principal minors $M_1, M_2, M_3$ 
of M are non-negative. After some algebraic manipulation, we find that 

$$bc\,M_1 = (b - c)^2 + bc > 0,$$

$$4abc^2 M_2 = 4c^2\left(\sum a^2 - \sum ab\right) + ab\left(2\sum ab - \sum a^2\right) > 0, \text{ and}$$

$$(x + y)^2(y + z)^2(z + x)^2 M_3 = \left(\sum xy\right)\left(\sum x^2 - \sum xy\right)\left(\sum x^2 + 3\sum xy\right) \ge 0$$

with equality if and only if x = y = z, or equivalently a = b = c. Here we simplify 
the calculation of $M_3$ by using the duality transformation [2] 

$$a = y + z, \quad b = z + x, \quad c = x + y,$$

where x, y, z are arbitrary non-negative numbers, not all zero. 
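
The minors can be verified with exact rational arithmetic. The sketch below is illustrative: the function names are assumptions, the matrix entries are read off as the coefficients of the difference of the two sides of (4), and the closed form for $M_3$ is the one quoted above.

```python
from fractions import Fraction


def matrix_M(a, b, c):
    """Symmetric matrix of the quadratic form (4) in r_1, r_2, r_3."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    m12 = (a + b - 3 * c) / (2 * c)
    m13 = (a + c - 3 * b) / (2 * b)
    m23 = (b + c - 3 * a) / (2 * a)
    return [
        [(b * b + c * c - b * c) / (b * c), m12, m13],
        [m12, (c * c + a * a - c * a) / (c * a), m23],
        [m13, m23, (a * a + b * b - a * b) / (a * b)],
    ]


def leading_minors(M):
    """The 1x1, 2x2 and 3x3 leading principal minors."""
    m1 = M[0][0]
    m2 = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    m3 = (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    return m1, m2, m3


def m3_closed_form(x, y, z):
    """Stated value of M_3 after substituting a = y + z, b = z + x, c = x + y."""
    x, y, z = Fraction(x), Fraction(y), Fraction(z)
    s2 = x * x + y * y + z * z
    s11 = x * y + y * z + z * x
    return s11 * (s2 - s11) * (s2 + 3 * s11) / ((x + y) * (y + z) * (z + x)) ** 2
```

For the equilateral case a = b = c the third minor vanishes, in line with the equality condition.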
Inequality (2) follows from adding the following two inequalities found in 
[1, p. 110]: 

$$3\sum R_2R_3 \ge 12\sum r_2r_3, \qquad \sum R_1^2 \ge 4\sum r_1^2.$$

These inequalities are equalities if and only if ABC is equilateral and P is its center. 


1. O. Bottema et al., Geometric Inequalities, Noordhoff, Groningen, 1969. 
2. M.S. Klamkin, Duality in triangular inequalities, Ford Motor Company, preprint, July 1971. 


Also solved by G. V. Ferrer (Mexico), F. Leuenberger (Switzerland), Michael Goldberg, and 
the proposer. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers — The State University, 
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with 
double spacing) on separate, signed sheets and should be mailed before June 30, 1973. 
Contributors (in the United States) who desire acknowledgement of receipt of their solutions 
are asked to enclose self-addressed, stamped postcards. 


5883 [1972, 1042]. Proposed by Frank Bernhart, Kansas State University 


Correction. The condition that set S be finite was inadvertently omitted from 
the hypothesis. 


5900. Proposed by E. R. Gentile, University of Buenos Aires, Argentina 


Let A and B be abelian groups (or modules over a principal ideal domain) such 
that $A \oplus B$ is a nonzero free abelian group (module). Prove that A and B are free. 


5901. Proposed by E. D. Dixon, Tennessee Technological University 


If a and x are elements of a ring R we denote $[a,x] = [a,x]_1 = ax - xa$ and, in 
general, $[a,[a,x]_h] = [a,x]_{h+1}$ for all positive integers h. Show that if P is a 
polynomial with coefficients which are integers or coefficients which are in the center 
of R, then 

(1) $[a, [P(a),x]_h] = [P(a), [a,x]]_h$ and 

(2) if $[a,x]_h = 0$ then $[P(a), x]_h = 0$. 


5902. Proposed by John H. Hubbard 


Integration is with respect to Lebesgue measure. Let X < R™ be a measurable set 
of finite measure. For any function f:R">R, call X; ={xeX: f(x) 290}, 
Xp ={xeX: f(x) SO}. 




Prove that for any whole number n 21, there exists a polynomial function 
p: R"—R of degree at most n such that for any polynomial function g: R" > R of 
degree at most n satisfying q(0) = 0, 


q = i q. 
Ie X5 


5903. Proposed by G. A. Heuer, Concordia College, and Albert Wilansky, 
Lehigh University 


If B is a two-dimensional noncommutative algebra over R (the real numbers), 
it is known that the multiplication in B is given by ab = f(a)b or by ab = f(b)a for 
some linear functional f on B. (Cf. Wilansky, Functional Analysis, problem 40, 
p. 258.) Is there a noncommutative multiplication in the set $R^2$ which, together with 
the usual vector addition, makes $R^2$ into an associative ring which is not an algebra? 


5904. Proposed by S. Y. Chen and S. C. Hsieh, National Tsing Hua University, 
Taiwan 


Consider a semigroup S in which each pair of elements a, b satisfy aba = a. For 
such a in S let $T_a$ be the set of elements b such that ab = a. Show that if, for any a, 
$T_a$ contains more than half the elements of S then $T_b = S$ for all b in S. 


5905. Proposed by J. G. Wendel, University of Michigan 


Let x be a mapping of (0,1) into Euclidean space $R^d$ and let $\{u_n\}$ be a countable 
dense set of vectors on the unit sphere of $R^d$. Suppose that for each n, $\limsup_{t \to 0} 
u_n \cdot x(t) = 1$. (a) Prove that $\limsup_{t \to 0} |x(t)| = 1$. (b) Can the denseness of the set 
$\{u_n\}$ be dispensed with? (c) Is the result true in Hilbert space? 


SOLUTIONS OF ADVANCED PROBLEMS 
Generalized Bases in Cross Spaces 


5826 [1971, 1143]. Proposed by Richard Tapia, Rice University 


Let X be a Banach space with dual X*. Consider $B = \{(x_\alpha, y_\alpha): \alpha \in A\} \subset X \times X^*$ 
such that 

(1) $\langle x_\alpha, y_\beta \rangle = \delta_{\alpha\beta}$ (Kronecker delta). 

Arsove and Edwards call B a generalized basis if in addition to (1) we have 

(2) The linear span of $\{x_\alpha: \alpha \in A\}$ is dense in X. 

Davis calls B a dual generalized basis if in addition to (1) we have 

(3) The linear span of $\{y_\alpha: \alpha \in A\}$ is dense in X*. 

Prove or disprove: these two notions of bases coincide in Hilbert space. 


Solution by E. M. Klein, University of Wisconsin-Milwaukee. We disprove 
the assertion. In the Hilbert space $l^2$ let $x_n = (0, 0, \ldots, 0, 1, 0, \ldots)$ where the 1 is in the 
(n + 1)-th position and let $y_n = (1, 0, \ldots, 0, 1, 0, \ldots)$ where the second 1 is in the 
(n + 1)-th position. Since $l^2$ is its own dual, the system $B = \{(x_n, y_n): n = 1, 2, 3, \ldots\}$ 
satisfies (1). Equation (2) is not satisfied since the vector (1, 0, 0, 0, ...) is orthogonal 
to all the $x_n$. However, (3) is satisfied since if $b = (b_1, b_2, b_3, \ldots)$ is orthogonal to all 
the $y_n$, then $b_n = -b_1$ for n = 2, 3, ... and $b_n$ converges to 0, so b = (0, 0, 0, ...). 
Hence B is a dual generalized basis but not a generalized basis, and 

$$B' = \{(y_n, x_n): n = 1, 2, 3, \ldots\}$$

is a generalized basis but not a dual generalized basis. 
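
The biorthogonality relation (1) for this system can be illustrated on finite truncations of the sequences. This is only a sketch: truncating to `dim` coordinates is an assumption made so that the vectors fit in a list, and the function names are illustrative.

```python
def x_vec(n, dim):
    """Truncation of x_n: a single 1 in position n + 1 (counting from 1)."""
    return [1 if i == n else 0 for i in range(dim)]


def y_vec(n, dim):
    """Truncation of y_n: a 1 in position 1 and a 1 in position n + 1."""
    return [1 if i in (0, n) else 0 for i in range(dim)]


def inner(u, v):
    return sum(s * t for s, t in zip(u, v))
```

With `dim = 8`, `inner(x_vec(m, 8), y_vec(n, 8))` is 1 when m = n and 0 otherwise for 1 ≤ m, n ≤ 7, and the first unit vector is orthogonal to every `x_vec(n, 8)` but has inner product 1 with every `y_vec(n, 8)`.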


Also solved by J. W. Evans, R. B. Israel, A. A. Jagers (Netherlands), and P. J. Owens (England). 


Automorphisms of p-Groups 


5827 [1971, 1144]. Proposed by Ronald Hirshon, Polytechnic Institute of 
Brooklyn 


Let B be a finite p-group, p > 2. Let Z, the center of B, be cyclic with w a generator 
of Z. Does there exist an automorphism $\varphi$ of B such that $w\varphi = w^j$ for some j with 
$j \not\equiv 1 \pmod{p}$? If the answer is not always yes, will $\varphi$ exist if we assume there is a 
maximal subgroup of B not containing Z? 


Editorial Note. No solution has been received for this problem. The proposer 
indicates that he has verified a "yes" answer to the first question for groups of order 
$p^n$ with $n \le 5$. In correspondence with the editors, J. L. Alperin says that the answer 
to the first question is "no" for p = 7, and that there should exist (possibly tedious) 
counterexamples for all p > 2. Alperin also conjectures that the answer is "no" for 
the second question. Further contributions are invited. 


Representations by Permuting Countable Subsets of (0,1) 


5828 [1971, 1144]. Proposed by D. P. Giesy, Western Michigan University 


Let $x \in (0,1)$. Is there an enumeration $\{q_n\}$ of the rationals in (0,1) such that 
$\sum_{n=1}^{\infty} q_n/2^n = x$? (See Problem 5700 [1970, 1018], especially the Editorial Note.) 


5829 [1971, 1144]. Proposed by D. P. Giesy, Western Michigan University 


Q is a countable subset of (0,1). Find necessary and sufficient conditions on Q 
that it have the property: For every $x \in (0, 1)$ there exists an enumeration $q_1, q_2, \ldots$ 
of Q such that $\sum_{n=1}^{\infty} q_n/2^n = x$. (See Problem 5700 [1970, 1018], especially the 
Editorial Note, also the preceding problem.) 


Solution by Neal Felsinger, Edgewood Arsenal, Maryland. It is obvious that 
inf Q = 0 and sup Q = 1 are necessary. We will show that they are also sufficient. 
Let $\{q_n\}$ be an ordering of Q and suppose $x \in (0, 1)$ is given. We will define sequences 
$\{x_n\}_{n \ge 1}$, $\{y_n\}_{n \ge 0}$ such that $y_n = \sum_{k=1}^{n} x_k/2^k$, $x_k \in Q$ and $x > y_n > x - 2^{-n}$. We set 
$y_0 = 0$. Suppose $y_n$ is defined and let m be the least positive integer such that $q_m$ is 
not an $x_k$, $k \le n$. We consider three cases: 

(i) $x > y_n + q_m/2^{n+1} > x - 2^{-(n+1)}$: Merely let $x_{n+1} = q_m$, $y_{n+1} = y_n + q_m/2^{n+1}$. 

(ii) $y_n + q_m/2^{n+1} \ge x$: Let j be the unique integer such that $y_n + q_m/2^{j+1} < x$ 
and $y_n + q_m/2^{j} \ge x$. Let $x_{j+1} = q_m$ and let $x_k$, $n < k \le j$, be any element of Q not 
already chosen, less than $x - (y_n + q_m/2^{j+1})$. Then 

$$y_{j+1} = (y_n + q_m/2^{j+1}) + \sum_{k=n+1}^{j} x_k/2^k < (y_n + q_m/2^{j+1}) + (x - y_n - q_m/2^{j+1}) = x$$

and 

$$x - 2^{-(j+1)} \le y_n + q_m/2^{j} - 2^{-(j+1)} = (y_n + q_m/2^{j+1}) + (q_m - 1)/2^{j+1} < y_n + q_m/2^{j+1} < y_{j+1}.$$


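(iii) $y_n + q_m/2^{n+1} \le x - 2^{-(n+1)}$: Let j > n be the unique integer satisfying 

$$y_n + \sum_{k=n+1}^{j} 2^{-k} \le x \quad\text{and}\quad y_n + \sum_{k=n+1}^{j+1} 2^{-k} > x.$$

For $n < s \le j$, determine $x_s$ in Q, not already selected, so that 

$$x_s > 1 - \Big[\,y_n + \sum_{k=n+1}^{j} 2^{-k} - \big(x - 2^{-(j+1)}\big)\Big],$$

so that $x_s \ge 1 - 2^{-(j+1)}$. Then 

$$x - 2^{-j} < x - 2^{-(j+1)} < y_n + \sum_{k=n+1}^{j} x_k 2^{-k} < y_n + \sum_{k=n+1}^{j} 2^{-k} \le x.$$

Thus $x - 2^{-j} < y_j < x$ and $x - 2^{-(j+1)} < y_j < y_j + q_m/2^{j+1}$. Therefore either case 
(i) or case (ii) applies to $y_j$ depending on whether or not $y_j + q_m/2^{j+1} < x$. 
Finally $x = \lim y_n = \sum_{n=1}^{\infty} x_n/2^n$ and $Q = \{x_n\}_{n \ge 1}$. 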


Also solved by R. L. Enison, P. J. Owens (England), and the proposer. Problem 5828 only 
solved by G. J. Butler and by R. A. Struble. 


Editorial Note. An earlier solution to Problem 5828 was part of the solution to Problem 5700, 
given by the proposers of 5700, R. A. Struble and R. E. Chandler. Actually Struble had also proposed 
5828 independently. 




Function with a Natural Boundary 
5830 [1971, 1144]. Proposed by Leonard Carlitz and R. A. Scoville, Duke 
University 


Let $\alpha$ be a positive irrational number and put 

$$\varphi(z) = \sum_{n=1}^{\infty} z^{[\alpha n]},$$

where $[\alpha n]$ denotes the greatest integer $\le \alpha n$. Show that $\varphi(z)$ has the unit circle for 
a natural boundary. 


Solution by C. C. Rousseau, Memphis State University. A theorem due to Szegő 
(Dienes, The Taylor Series, p. 324) states that if 

$$f(z) = \sum_{k=0}^{\infty} a_k z^k$$

and if $\{a_k\}$ contains only finitely many distinct numbers, then f(z) has the unit circle 
as a natural boundary unless $a_{k+p} = a_k$ for all sufficiently large k, in which case f(z) 
is the rational function $P(z)/(1 - z^p)$. 

Writing $\varphi(z)$ as a power series 

$$\varphi(z) = \sum_{k=0}^{\infty} a_k z^k,$$

where $a_k$ is the number of integers $n \ge 1$ such that $[\alpha n] = k$, we see that, for fixed $\alpha$, 
$\{a_k\}$ contains at most finitely many distinct numbers. Thus, we need only show that 
if $\alpha$ is irrational, $\{a_k\}$ cannot be periodic. 

Assume that $\{a_k\}$ is periodic, i.e., that there exists p such that $a_{k+p} = a_k$ for all 
sufficiently large k. Using the assumption of periodicity and setting 

$$q = \sum_{j=0}^{p-1} a_{k+j},$$

it follows that if $n_k$ is the smallest integer n such that $[\alpha n] = k$, then $n_k + mq$ is the 
smallest integer n such that $[\alpha n] = k + mp$. Hence, we write 

$$\alpha n_k = k + \beta_k \quad\text{and}\quad \alpha(n_k + mq) = k + mp + \beta_{k+mp},$$

where $0 \le \beta_{k+mp} < \alpha$ (m = 0, 1, ...). From the difference of these two expressions we 
obtain 

$$\beta_{k+mp} = \beta_k + mq(\alpha - p/q).$$

It follows that $0 \le \beta_{k+mp} < \alpha$ cannot be satisfied for all m unless $\alpha = p/q$. Hence, if $\alpha$ 
is irrational, $\{a_k\}$ cannot be periodic. 
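
The coefficient sequence is easy to tabulate for a specific irrational. The sketch below takes $\alpha = \sqrt{2}$ as an illustrative assumption, so that $[\alpha n]$ can be computed exactly as the integer square root of $2n^2$; the function names are likewise illustrative. A finite table can only rule out short periods on the range examined; it is a sanity check, not a proof.

```python
from math import isqrt


def floor_alpha_n(n):
    """[alpha * n] for alpha = sqrt(2), computed exactly as isqrt(2 * n^2)."""
    return isqrt(2 * n * n)


def coefficients(K):
    """a_k = number of integers n >= 1 with [alpha n] = k, for 0 <= k < K."""
    a = [0] * K
    n = 1
    while floor_alpha_n(n) < K:
        a[floor_alpha_n(n)] += 1
        n += 1
    return a


def has_period(a, p, start=0):
    """True if a[k + p] == a[k] for every index k >= start available in the list."""
    return all(a[k + p] == a[k] for k in range(start, len(a) - p))
```

Since $\sqrt{2} > 1$, each $a_k$ here is 0 or 1, which illustrates the remark that only finitely many distinct coefficients occur.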




Also solved by L. W. Carroll, L. Kuipers, and the proposers. 

Carroll’s proof is immediate from a theorem of Hecke on the fact that, for $\beta$ irrational, 
$\sum [(n + 1)\beta]\, z^n$ has a natural boundary. Kuipers’ proof is based on Problem 168 in Vol. 1 of Pólya-Szegő’s 
Aufgaben, a result equivalent to Hecke’s theorem. 


Convex Sets with Nonempty Interiors 


5831 [1971, 1144]. Proposed by Albert Wilansky, Lehigh University 


Let C be a convex closed set in a normed space such that $C + D_1 \supset D_{1+\varepsilon}$. Must 
C have nonempty interior? (Here $D_r = \{x: \|x\| < r\}$, $\varepsilon > 0$.) 


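Solution by R. B. Israel, University of Chicago. C must have nonempty interior 
and, in fact, we must have $D_\varepsilon \subset C$. For real numbers r we define $rC = \{rc: c \in C\}$. 
By the convexity of C it is easy to see that for positive r and s we have $rC + sC 
= (r + s)C$. Now we have $D_{1+\varepsilon} \subset D_1 + C$, so 

$$D_{(1+\varepsilon)^2} = (1 + \varepsilon)D_{1+\varepsilon} \subset D_{1+\varepsilon} + (1 + \varepsilon)C \subset D_1 + (1 + (1 + \varepsilon))C.$$

By iterating n times, we find that 

$$D_{(1+\varepsilon)^n} \subset D_{(1+\varepsilon)^{n-1}} + (1 + \varepsilon)^{n-1}C \subset \cdots \subset D_1 + \sum_{j=0}^{n-1}(1 + \varepsilon)^j C = D_1 + \varepsilon^{-1}\big((1 + \varepsilon)^n - 1\big)C,$$

and, letting $(1 + \varepsilon)^n = r$, we have $D_\varepsilon \subset D_{\varepsilon/r} + (1 - 1/r)C$. Let x be any member of 
$D_\varepsilon$. Then for each n there are $y_n$ and $z_n$ such that $x = y_n + z_n$ with $y_n \in D_{\varepsilon/r}$ and 
$r(r - 1)^{-1}z_n \in C$. But as $n \to \infty$, $r \to \infty$, so $y_n \to 0$ and $z_n \to x$. Moreover, 
$r(r - 1)^{-1}z_n \to x$, and since C is closed we must have $x \in C$. Thus $D_\varepsilon \subset C$. 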


Also solved by P. R. Chernoff, Moshe Feder & Simeon Reich (Israel), John Horvath, R. M. 
Koch, L. E. Mattics, P. J. Owens (England), and the proposer. 


Notes. (1) The proposer states that the problem arises in providing an elementary proof that a 
map between Banach spaces is almost open. This question arises in determining a sufficient 
condition for such a map to be onto. See Bade and Curtiss, Pacific J. of Math. 18 (1966), pp. 391 ff., 
Theorem 1.2; also Kaufman, Proceedings of the A. M. S., 17 (1966), pp. 767-768. 

(2) Koch rephrases the problem and proves: Let C be a convex set in a topological vector space 
such that $C + U \supset (1 + \varepsilon)U$, where U is a bounded neighborhood of 0, $\varepsilon > 0$. Then the closure of C 
is a neighborhood of 0, actually $\bar{C} \supset \varepsilon U$. 


THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 BY BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 
CONTENTS 
A Unified Theory of Integration . . . E. J. MCSHANE 
Highlights in the History of Spectral Theory . . . L. A. STEEN 
The Legend of John von Neumann . . . P. R. HALMOS 
Alternating Euler Paths for Packings and Covers . oe. )6hUCL UT. ZAHN, JR. 
MATHEMATICAL NOTES 
Stable Laws and the Imbedding of L? Spaces. . . . . = . . MAREK KANTER 
A Convex Matrix Function .. . . . . .  . M. H. Moore 
Solution of Fejes Téth’s Illumination Problem . oo. . . . .  B. R. HENRY 
A Covering Theorem .. . .  . J.C. KIEFFER 
Distributivity over the Dirichlet Product and Completely Multiplicative 
Arithmetical Functions . . . ERIC LANGFORD 
Perfect Parallelograms . . . . . 2.) . OR OW. OSTELAFE 
A Crowded Set of Non-intersecting Lines. . . . . . . . J. A. Erpswick 
RESEARCH PROBLEMS 
A Deception Game... . . . . CJL SPENCER 
CLASSROOM NOTES 
Traffic Flow: Laplace Transforms . . .  E. A. BENDER AND L. P. NEUWIRTH 
Irrational Numbers. .. .  .  .J. P. JONES AND S. TOPOROWSKI 
A Simple Proof of the Formula $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$ . . . IOANNIS PAPADIMITRIOU 
Another Elementary Proof of Euler’s Formula for $\zeta(2n)$ . . . T. M. APOSTOL 


(Continued on inside cover) 


APRIL 


NUMBER 4 


CODEN: AMMYAE 




1973 


MATHEMATICAL EDUCATION 
An Integrated Sequence in the Mathematical Sciences for Undergraduate 


Business Students . . . . . . . R.H. RANDLES AND A. J. SCHAEFFER 
ELEMENTARY PROBLEMS AND SOLUTIONS . 


ADVANCED PROBLEMS AND SOLUTIONS 

REVIEWS ; 

NEws AND NOTICES . ee 

MATHEMATICAL ASSOCIATION OF AMERICA 
October Meeting of the North Central Section 
Officers and Committees as of February 1, 1973. 
Calendars of Future Meetings 




NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p. 2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 


protection against loss. 


Backlog: Main Articles 12 months, Math. Notes 15 months, Research Problems 7 months, Classroom Notes 


11 months, Math. Education 10 months. 


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of Mathe- 
matics, Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor: 
ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of America, 
SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B. 
WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036. 


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect 
ASSOCIATE EDITORS 


JOSHUA BARLAZ J. G. HARVEY SEYMOUR SCHUSTER 

E. R. BERLEKAMP ERIC S. LANGFORD J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA P. D. LAX E. P. STARKE 

ROBERT GILMER ARTHUR MATTUCK LYNN A. STEEN 
RICHARD GUY M. W. POWNALL JAMES WENDEL 

RAOUL HAILPERN GIAN-CARLO ROTA 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 
Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


A UNIFIED THEORY OF INTEGRATION 
E. J. MCSHANE, University of Virginia 


1. Introduction. An undergraduate student of mathematics, or science, or engineering, 
is usually introduced to an assortment of integrals. First he meets the one- 
dimensional Darboux integral, defined in terms of the elementary integrals of step- 
functions above and below the integrand. By the time he finishes a more advanced 
course in calculus he has met multiple integrals, several kinds of ‘‘improper’’ inte- 
grals, line integrals, etc. But, beyond this, in mathematics or theoretical physics or 
engineering science, it is important to know (or at least be willing to believe in!) 
the Lebesgue integral. Unfortunately, few students of science or engineering can 
afford the time to start afresh with a new theory of integration, especially since it 
appears abstract and disconnected from all that they have previously learned. Even 
students of mathematics not specializing in analysis often fail to develop any ease 
in using the Lebesgue integral. (More than once I have found graduate students 
unable to answer the question ‘‘what is the Lebesgue integral from 0 to 1 of x*dx?’’) 

Since this lack of unity in integration theory is an inconvenience to all sorts of 
students and hinders the access of scientists to parts of mathematics that are important 
in theoretical physics and engineering science, I am enthusiastic about promoting 
the use of a theory of integration in which there is in effect only one definition of 
the integral. This is presented in successively more general settings, but each change 
is merely a rather simple amendment to take in more territory. We never have to 
abandon it in favor of the Lebesgue integral, because it is equivalent to the Lebesgue 
integral. For use in teaching, the crucial point is to show that this theory can be 
smoothly fitted on to existing classroom methods at any point, but preferably just 
after the student has finished a beginning course in which the Darboux integral has 
been defined, and its importance has been demonstrated by examples, and the student 
has acquired some facility in its use by exercises. The following pages, therefore, 
will be devoted to showing how the integral can be presented to such a beginner 
and can be extended step by step to satisfy all the requirements of undergraduate 


E. J. McShane received his Ph.D. from the University of Chicago in 1930. His dissertation, on 
existence theorems for isoperimetric problems in the calculus of variations, was written under the 
direction of Gilbert Ames Bliss. He has been a National Research Council Fellow (1930-32); a 
Hilfsassistent at Göttingen; on the Princeton faculty, 1933-35; and since then a professor at the
University of Virginia, with some interruptions — the war years at the Ballistic Research Laboratory 
in the Aberdeen Proving Ground, a year at the Institute for Advanced Study, a year at the University 
of Utrecht, a year at the Rockefeller University, a semester at the University of Kyoto. He is a 
member of the National Academy of Sciences and of the American Philosophical Society, and has 
been president of the MAA and of the AMS. The Association has awarded him the Chauvenet Prize 
and the Award for Distinguished Service. He has published numerous papers, mostly on calculus of 
variations, integration theory, control theory and stochastic processes; also several books, including 
one entitled Integration which this paper is intended to sabotage. (Submitted by author.) 




350 E. J. MCSHANE [April 


mathematics, without asking too much of reasonably capable undergraduates. It 
has already been shown (McShane [1]) that the same type of definition can be used, 
and deep theorems proved, in settings of great generality. Our concern here is with 
the quite different, and in my opinion much more important, problem of making 
modern integration theory painlessly available to a large body of students. 

We therefore suppose that we are facing a group of students who know the definition of the
integral of a step-function on an interval in one-dimensional space $R^1$, and who know that the
(Darboux) integral

$$\int_a^b f(x)\,dx$$

is the unique number J which is the infimum of the integrals of step-functions $S \ge f$
and the supremum of the integrals of step-functions $s \le f$, provided that there is
such a unique J. In this whole presentation we shall be forced to omit a great deal
and condense the rest much more than would be appropriate in a classroom, so that
we ask much of the reader in filling in details and even believing unsupported
assertions. The full presentation will be in a book whose manuscript is still less than
half finished.


2. Transition from the Darboux integral. In one-dimensional space $R^1$ a neighborhood
of a point x will mean an open interval that contains x. We shall often
make use of functions $\delta$ defined on some set E in $R^1$ (or, later, in some other space)
such that for each x in E, $\delta(x)$ is a neighborhood of x. Such a neighborhood-valued
function will be called a gauge on E. For example, we can phrase the definition of
continuity thus. A real-valued function f on E is continuous on E if for each positive
$\varepsilon$ there is a gauge $\delta$ such that whenever $\bar{x} \in E$ and $x \in E \cap \delta(\bar{x})$, it is true that
$|f(x) - f(\bar{x})| < \varepsilon$.

We introduce two more expressions. A partition of the right-closed interval
(a, b] will be a finite set $P = \{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$ of pairs in which each $\bar{x}_i$ is a
point of the closed interval [a,b], each $A_i$ is a right-closed subinterval of (a, b],
and each point of (a, b] belongs to exactly one of the $A_i$. The partition P, or more
generally, any collection of pairs $(\bar{x}_i, A_i)$, is said to be $\delta$-fine if for each i, the interval
$A_i$ is contained in the neighborhood $\delta(\bar{x}_i)$ of the point $\bar{x}_i$.
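The two definitions above can be made concrete with a minimal sketch. It assumes a gauge is represented as a Python function returning the endpoints (lo, hi) of an open interval, and a partition as a list of pairs (tag, (l, r)) standing for $(\bar{x}_i, A_i)$ with $A_i = (l, r]$. The names `is_partition`, `is_fine`, `gauge` and the sample partition `P` are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def is_partition(pairs, a, b):
    """True if the right-closed cells (l, r] cover (a, b], each point exactly once."""
    cells = sorted(cell for _tag, cell in pairs)
    if not cells or cells[0][0] != a or cells[-1][1] != b:
        return False
    if any(l >= r for l, r in cells):
        return False
    # consecutive cells must abut: each one starts where the previous one ended
    return all(prev[1] == nxt[0] for prev, nxt in zip(cells, cells[1:]))

def is_fine(pairs, gauge):
    """True if every cell (l, r] lies inside the open interval gauge(tag)."""
    for tag, (l, r) in pairs:
        lo, hi = gauge(tag)
        # (l, r] is contained in (lo, hi) exactly when lo <= l and r < hi
        if not (lo <= l and r < hi):
            return False
    return True

def gauge(x):
    # hypothetical gauge: the open interval of radius 1/3 about x
    r = Fraction(1, 3)
    return (x - r, x + r)

# a tagged partition of (0, 1]; note that the tag 0 is not in its own cell (0, 1/4]
P = [
    (Fraction(0), (Fraction(0), Fraction(1, 4))),
    (Fraction(1, 2), (Fraction(1, 4), Fraction(3, 4))),
    (Fraction(1), (Fraction(3, 4), Fraction(1))),
]
print(is_partition(P, Fraction(0), Fraction(1)), is_fine(P, gauge))
```

Exact rational arithmetic is used only so that the endpoint comparisons are not blurred by rounding.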

Suppose now that f is defined and continuous on a closed interval [a,b]. Let $\varepsilon$
be positive. There is a gauge $\delta$ on [a,b] such that if $\bar{x} \in [a,b]$ and $x \in [a,b] \cap \delta(\bar{x})$
then $|f(x) - f(\bar{x})| < \frac{\varepsilon}{2(b-a)}$. Let $P = \{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$ be any $\delta$-fine
partition of (a,b]; for the moment we merely assume that such partitions exist. If x is
any point in the interval $A_i$, x is in $\delta(\bar{x}_i)$ because $A_i \subset \delta(\bar{x}_i)$. So, by the choice of
gauge $\delta$, we have

(2.1)  $f(\bar{x}_i) - \frac{\varepsilon}{2(b-a)} < f(x) < f(\bar{x}_i) + \frac{\varepsilon}{2(b-a)}.$


We now define two step-functions s, S on (a, b] by setting 


1973] A UNIFIED THEORY OF INTEGRATION 351 


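$$s(x) = f(\bar{x}_i) - \frac{\varepsilon}{2(b-a)}, \qquad S(x) = f(\bar{x}_i) + \frac{\varepsilon}{2(b-a)}$$

for all x in $A_i$ (i = 1, ..., k). Then by (2.1) we have s < f < S. If we use the symbol
$mA_i$ to denote the length of $A_i$, by elementary calculus we have

(2.2)  $\int_a^b S(x)\,dx = \sum_{i=1}^{k} \left\{ f(\bar{x}_i) + \frac{\varepsilon}{2(b-a)} \right\} mA_i$
       $= \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i + \frac{\varepsilon}{2(b-a)} \sum_{i=1}^{k} mA_i$
       $= \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i + \varepsilon/2,$

and similarly

(2.3)  $\int_a^b s(x)\,dx = \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i - \varepsilon/2.$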


Therefore there are step-functions s, S such that s < f < S and the integrals of s
and S differ by the arbitrarily small number $\varepsilon$. By elementary calculus, f has an
integral; we denote it by J. Moreover, J is between the integrals of s and S. But
by (2.2) and (2.3) this implies that

$$\sum_{i=1}^{k} f(\bar{x}_i)\,mA_i - \varepsilon/2 \le J \le \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i + \varepsilon/2,$$

whence

(2.4)  $\left| \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i - J \right| < \varepsilon.$
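Inequality (2.4) can be checked numerically in a simple case. The sketch below assumes f(x) = x^2 on (0, 1], whose Darboux integral is J = 1/3, and uses the constant gauge of radius $\varepsilon/4(b-a)$, which suffices here because |f(x) - f(y)| <= 2|x - y| on [0, 1]. The helper names, the random choice of tags and the factor 0.9 are illustrative assumptions.

```python
import random

def riemann_sum(f, pairs):
    """Sum of f(tag) * length over the tagged cells (tag, (l, r))."""
    return sum(f(tag) * (r - l) for tag, (l, r) in pairs)

def fine_partition_constant_gauge(a, b, radius, rng):
    """Tagged partition of (a, b] that is fine for delta(x) = (x - radius, x + radius)."""
    n = int(2 * (b - a) / radius) + 1          # makes the cell length h < radius / 2
    h = (b - a) / n
    pairs = []
    for i in range(n):
        l, r = a + i * h, a + (i + 1) * h
        # any tag in [a, b] with r - 0.9*radius <= tag <= l + 0.9*radius keeps the
        # cell inside delta(tag); the tag need not lie inside its own cell
        lo, hi = max(a, r - 0.9 * radius), min(b, l + 0.9 * radius)
        pairs.append((rng.uniform(lo, hi), (l, r)))
    return pairs

a, b, eps = 0.0, 1.0, 0.01
f = lambda x: x * x                      # |f(x) - f(y)| <= 2|x - y| on [0, 1]
radius = eps / (4 * (b - a))             # gives |f(x) - f(tag)| < eps / (2(b - a))
P = fine_partition_constant_gauge(a, b, radius, random.Random(0))
S = riemann_sum(f, P)
J = 1.0 / 3.0                            # Darboux integral of x^2 over (0, 1]
print(abs(S - J) < eps)
```

Changing the seed changes the tags but not the conclusion, since the argument leading to (2.4) applies to every fine partition.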


We now have exhibited a new procedure for identifying the Darboux integral
J of the continuous function f. It is the (unique) number J such that to each positive
$\varepsilon$ there corresponds at least one gauge $\delta$ such that for every $\delta$-fine partition
$P = \{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$ of (a, b], inequality (2.4) is satisfied. Since this procedure
always gives back the known Darboux integral for continuous f, we have the privilege
of using it as the definition of the integral for such f. But the same procedure does
much more. It picks out one number J (by inequality (2.4)) for many discontinuous
and even unbounded f, for which no Darboux integral exists. When this happens,
we still apply the name "integral" to the number J that the procedure selects for
us, thus.


DEFINITION 2.1. A real-valued function f is said to be integrable over a right-closed
interval (a, b] if it is defined on the closure [a,b], and there is a number J
such that: to each positive $\varepsilon$ there corresponds a gauge $\delta$ such that for every $\delta$-fine
partition $P = \{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$ of (a,b] it is true that

$$\left| \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i - J \right| < \varepsilon.$$

In this case we define

$$\int_a^b f(x)\,dx = J.$$


In the theory of the Darboux (or Riemann) integral, to prove the integrability
of a function on [a,b] we have to show the existence of step-functions s and S like
those of the preceding paragraph. This can be accomplished by proving and using
the Heine-Borel theorem. In order to develop the theory of the integral defined in
Definition 2.1, we need to know that for every gauge $\delta$ on [a,b] there exist $\delta$-fine
partitions of (a,b], even though most undergraduates will falsely regard this as
evident. The proof is almost identical with that of the Heine-Borel theorem.
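The existence statement can be explored with a bisection search, sketched below under stated assumptions. The function `fine_partition`, its candidate tags (the two endpoints and the midpoint of each cell) and the sample gauge are illustrative. The existence proof itself is a compactness argument, and this search is not guaranteed to stop for every gauge, which is why it carries a depth limit.

```python
def fine_partition(a, b, gauge, max_depth=60):
    """Try to build a gauge-fine tagged partition of (a, b] by repeated bisection.

    gauge(x) returns the endpoints (lo, hi) of an open interval containing x.
    """
    def try_tag(l, r):
        # candidate tags: the endpoints and the midpoint of the closed cell
        for tag in (l, r, (l + r) / 2):
            lo, hi = gauge(tag)
            if lo <= l and r < hi:          # (l, r] lies inside (lo, hi)
                return tag
        return None

    def build(l, r, depth):
        tag = try_tag(l, r)
        if tag is not None:
            return [(tag, (l, r))]
        if depth == max_depth:
            raise RuntimeError("bisection did not terminate within max_depth")
        mid = (l + r) / 2
        return build(l, mid, depth + 1) + build(mid, r, depth + 1)

    return build(a, b, 0)

def gauge(x):
    # hypothetical gauge that shrinks toward 0: radius x/2 for x > 0, radius 0.1 at 0
    r = 0.1 if x == 0 else x / 2
    return (x - r, x + r)

P = fine_partition(0.0, 1.0, gauge)
for tag, cell in P:
    print(tag, cell)
```

For this gauge no cell whose closure contains 0 can be tagged at a positive point, so the search succeeds only because 0 itself is tried as a tag once the cell is short enough.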

It is easy to prove that step-functions are integrable, their integrals being given
by the familiar formula. From this we deduce that every Darboux-integrable function
$(f(x)\colon a \le x \le b)$ is also integrable by Definition 2.1, the integrals being the
same. So we have not abandoned the Darboux integral; we have generalized it.
But the generalization is considerable. For example, if $f(x) = x^{-1/2}$ for $x > 0$ and
$f(x) = 0$ for $x \le 0$, by exhibiting a gauge $\delta$ for each positive $\varepsilon$ we can show that

(2.5)  $\int_0^b f(x)\,dx = 2b^{1/2}$  $(b > 0)$.

This f is not Darboux integrable, since there is no step-function above f. Equation
(2.5) is more easily established after we prove the monotone convergence lemma
(§3), which will in fact enable us to prove that all the "absolutely convergent
improper integrals" of advanced calculus are covered by Definition 2.1, with no
"impropriety" at all.
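As a worked illustration of how the monotone convergence route to (2.5) would go, one can truncate f near 0. The truncations $f_n$ below are an assumed choice made for this sketch; each is bounded with a single jump, hence Darboux integrable, and the lower limit 0 matches f vanishing for $x \le 0$.

```latex
% Truncations: f_n(x) = x^{-1/2} for x >= 1/n, and f_n(x) = 0 for x < 1/n.
\[
  \int_0^b f_n(x)\,dx \;=\; \int_{1/n}^{b} x^{-1/2}\,dx \;=\; 2b^{1/2} - 2n^{-1/2}
  \qquad (n > 1/b),
\]
\[
  f_1 \le f_2 \le f_3 \le \cdots, \qquad
  \lim_{n\to\infty} f_n(x) = f(x) \ \text{for every } x, \qquad
  \lim_{n\to\infty} \int_0^b f_n(x)\,dx \;=\; 2b^{1/2}.
\]
```

Since the limit of the integrals is finite, the monotone convergence lemma of §3 would give the integrability of f together with the value $2b^{1/2}$.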

Definition 2.1 specifies the integral as a limit, in a known generalized formulation 
of limit-theory (see, for example, [2], pp. 7-29). So uniqueness, the Cauchy criterion, 
etc., all hold. But it is probably better for normal advanced-calculus students just 
to prove these statements without wandering off into abstract limit theory. 

Some elementary texts use Riemann's definition instead of that of Darboux.
The last equation in Definition 2.1 is defined to mean that to each positive $\varepsilon$ there
corresponds a positive constant $\delta$ such that for every partition $\{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$
with each $\bar{x}_i$ in $A_i$ and each $A_i$ with length less than $\delta$, it is true that

$$\left| \sum_{i=1}^{k} f(\bar{x}_i)\,mA_i - J \right| < \varepsilon.$$

A student familiar with this definition should be able to make the transition to
Definition 2.1 with ease. But any student mature enough to master Riemann's
definition should be able to comprehend Definition 2.1, and he would profit by
replacing Riemann's definition by Definition 2.1 in the first place.


3. Absolute integrability and monotone convergence. The integral defined in §2
has the same elementary properties as the Riemann integral: it is linear in f;
it is non-negative when f is non-negative; f is integrable over (c, d] when it is integrable
over an interval (a,b] that contains (c,d]; it is integrable over (a,b] when it is
integrable over each of two disjoint intervals (a,c], (c,b] whose union is (a,b],
and then

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx;$$

and the familiar differentiability properties hold.

Because Definition 2.1 closely resembles the definition of the Riemann integral, 
the customary proofs can be carried over with little change, and there is no point 
in dwelling on them. However, we shall give condensed proofs of two statements 
absent from the elementary theory. 

Corresponding to a partition $P = \{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$ of (a,b] and a function
f on [a,b] the sum

$$\sum_{i=1}^{k} f(\bar{x}_i)\,mA_i$$

occurs frequently. It is convenient to call it a "Riemann sum" and to assign it a
symbol S(P; f).

We now prove that if f is integrable over (a,b] it has a property that I have
named "absolute integrability".


THEOREM 3.1. Let f be integrable over (a,b] and let J denote its integral. Let
$\varepsilon$ be positive, and let $\delta$ be a gauge on [a,b] such that for every $\delta$-fine partition P
of (a, b], $|S(P; f) - J| < \varepsilon$. Then for every two $\delta$-fine partitions

$$P' = \{(x'_1, A'_1), \dots, (x'_h, A'_h)\},$$
$$P'' = \{(x''_1, A''_1), \dots, (x''_k, A''_k)\}$$

of (a,b], it is true that

$$\sum_{i=1}^{h} \sum_{j=1}^{k} |f(x'_i) - f(x''_j)|\; m(A'_i \cap A''_j) < 2\varepsilon.$$

Every point of (a,b] belongs to exactly one of the intersections $A'_i \cap A''_j$,
where (as throughout this proof) i ranges over {1, ..., h} and j over {1, ..., k}. Define
$x'''_{i,j} = x'_i$ and $x^{iv}_{i,j} = x''_j$ if $f(x'_i) \ge f(x''_j)$; otherwise define $x'''_{i,j} = x''_j$ and $x^{iv}_{i,j} = x'_i$. In
either case,

(3.1)  $f(x'''_{i,j}) - f(x^{iv}_{i,j}) = |f(x'_i) - f(x''_j)|.$

Both sets of pairs

$$P''' = \{(x'''_{i,j},\, A'_i \cap A''_j)\}, \qquad P^{iv} = \{(x^{iv}_{i,j},\, A'_i \cap A''_j)\}$$

are $\delta$-fine partitions of (a,b]. So $S(P'''; f)$ and $S(P^{iv}; f)$ differ from J by less than $\varepsilon$,
and therefore differ from each other by less than $2\varepsilon$. That is,

$$\left| \sum_{i=1}^{h} \sum_{j=1}^{k} \{ f(x'''_{i,j}) - f(x^{iv}_{i,j}) \}\; m(A'_i \cap A''_j) \right| < 2\varepsilon.$$

By (3.1) this is the conclusion of Theorem 3.1.
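The double sum in Theorem 3.1 is easy to evaluate for concrete partitions. The sketch below assumes f(x) = x^2 on (0, 1] and the constant gauge of radius $\varepsilon/4$, for which every fine partition has Riemann sum within $\varepsilon$ of 1/3 because f is 2-Lipschitz there. The two partitions and all helper names are illustrative.

```python
def overlap(c1, c2):
    """Length of the intersection of two right-closed intervals (l, r]."""
    (l1, r1), (l2, r2) = c1, c2
    return max(0.0, min(r1, r2) - max(l1, l2))

def theorem_3_1_sum(f, P1, P2):
    """Double sum of |f(tag1) - f(tag2)| times the length of the common part of the cells."""
    return sum(abs(f(t1) - f(t2)) * overlap(c1, c2)
               for t1, c1 in P1 for t2, c2 in P2)

def uniform_partition(n, tag_of_cell):
    """Partition of (0, 1] into n equal cells, tagged by tag_of_cell(l, r)."""
    return [(tag_of_cell(i / n, (i + 1) / n), (i / n, (i + 1) / n)) for i in range(n)]

f = lambda x: x * x
eps = 0.1
radius = eps / 4                               # gauge delta(x) = (x - radius, x + radius)
P1 = uniform_partition(50, lambda l, r: l)     # left-endpoint tags, cell length 0.02
P2 = uniform_partition(64, lambda l, r: r)     # right-endpoint tags, cell length 1/64
total = theorem_3_1_sum(f, P1, P2)
print(total, total < 2 * eps)
```

Both cell lengths are below the radius 0.025, so both partitions are fine for this gauge and the theorem applies to the pair.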

From this and the Cauchy criterion for existence of the integral, we can prove
that if f is integrable over (a,b], and L is a function defined on the set of values
of f and satisfying $|L(y_1) - L(y_2)| \le K|y_1 - y_2|$ (K a constant) for all $y_1$ and $y_2$
that are values of f, then $(L(f(x))\colon a \le x \le b)$ is integrable over (a, b]. As
corollaries, if f, $f_1$ and $f_2$ are integrable over (a,b], and $f_1$ and $f_2$ are bounded, the
following functions are integrable: $f^+ = \max(f, 0)$, $f^- = \max(-f, 0)$, $|f|$, $f_1^n$
(n = 1, 2, 3, ...), $|f_1|^\alpha$ ($\alpha > 0$), $f_1 f_2$, and 1/f if f is bounded away from 0.

All the theorems mentioned so far have been valid for the Riemann (or Darboux) 
integral also. They apply to a larger class of functions; but if this were the only gain, 
it would not be enough to justify the introduction of Definition 2.1 into a calculus 
or advanced-calculus course. But now we shall prove a fundamentally important 
statement that has no analogue in the theory of the Darboux integral, namely a 
slightly restricted form of the monotone convergence theorem. In the proof we 
shall need the following rather easy consequence of Theorem 3.1.


THEOREM 3.2. Let f, $\delta$ and $\varepsilon$ be as in Theorem 3.1. Let $\{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$
be a $\delta$-fine set of pairs such that the $A_i$ are pairwise disjoint right-closed
subintervals of (a,b] (the union of the $A_i$ need not be all of (a,b]). Then

$$\sum_{i=1}^{k} \left| f(\bar{x}_i)\,mA_i - \int_{A_i} f(x)\,dx \right| \le 2\varepsilon.$$


THEOREM 3.3 (Monotone convergence lemma). Let $f_1, f_2, f_3, \dots$ be a sequence of
real-valued functions all defined on [a,b] and integrable over (a,b]. Assume
that for each x in [a,b], $f_1(x) \le f_2(x) \le f_3(x) \le \cdots$, and that $\lim_{n\to\infty} f_n(x)$ (which
necessarily exists, finite or infinite) is actually finite. Then the limit-function,
whose value at x is $\lim_n f_n(x)$, is integrable over (a,b] if and only if the limit

$$J = \lim_{n\to\infty} \int_a^b f_n(x)\,dx$$

is finite; and in that case,

$$\int_a^b \left[ \lim_{n\to\infty} f_n(x) \right] dx = J.$$


Let g denote $\lim_n f_n$. There is no loss of generality in assuming $f_1 \ge 0$. Suppose
$J < \infty$ (the "only if" part is trivial), and let $\varepsilon$ be positive. We must exhibit a gauge
$\delta$ on [a,b] such that for every $\delta$-fine partition P of (a, b],

(3.2)  $J - \varepsilon < S(P; g) < J + \varepsilon.$

Define

(3.3)  $J_n = \int_a^b f_n(x)\,dx.$

For each n there is a gauge $\delta_n$ such that if P' is a $\delta_n$-fine partition of (a, b] then

(3.4)  $|S(P'; f_n) - J_n| < \varepsilon/2^{n+2};$

in addition, we may suppose $\delta_1(x) \supset \delta_2(x) \supset \delta_3(x) \supset \cdots$ for all x in [a,b]. Define
$E_n$ to be the set of all x in [a,b] such that

(3.5)  $f_n(x) \ge \frac{2J + \varepsilon}{2J + 2\varepsilon}\, g(x).$

Then $E_1 \subset E_2 \subset E_3 \subset \cdots$. If g(x) = 0, $x \in E_1$, because then $f_1(x) = 0$. If g(x) > 0
the right member of (3.5) is less than the limit g(x) of $f_n(x)$, so (3.5) holds for all
large n. Hence $\bigcup E_n = [a,b]$.

The numbers $J_1, J_2, \dots$ are non-decreasing and tend to J, so we can and do
choose an N such that $J \ge J_N > J - \varepsilon/2$. Every x in [a,b] belongs to exactly one
of the sets

$$D_N = E_N, \quad D_{N+1} = E_{N+1} \setminus E_N, \quad \dots, \quad D_m = E_m \setminus E_{m-1}, \quad \dots$$

If $x \in D_n$, we define $\delta(x) = \delta_n(x)$. Then $\delta$ is a gauge on [a,b]. Let

$$P = \{(\bar{x}_1, A_1), \dots, (\bar{x}_k, A_k)\}$$

be any $\delta$-fine partition of (a,b]. Then P is $\delta_N$-fine, so

$$S(P; g) \ge S(P; f_N) > J_N - \varepsilon/2^{N+2},$$

and the first inequality in (3.2) holds.

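For the other, let I(n) be the set of those i in {1, ..., k} for which $\bar{x}_i \in D_n$
(n = N, N + 1, ...). There is a largest n, say n*, for which I(n) is not empty. By
(3.5) and Theorem 3.2,

(3.6)
$$\begin{aligned}
S(P; g) &= \sum_{n=N}^{n^*} \sum_{i \in I(n)} g(\bar{x}_i)\,mA_i \\
&\le \sum_{n=N}^{n^*} \Big\{ \sum_{i \in I(n)} \frac{2J + 2\varepsilon}{2J + \varepsilon}\, f_n(\bar{x}_i)\,mA_i \Big\} \\
&\le \frac{2J + 2\varepsilon}{2J + \varepsilon} \sum_{n=N}^{n^*} \Big\{ \sum_{i \in I(n)} \int_{A_i} f_n(x)\,dx + 2\varepsilon/2^{n+2} \Big\} \\
&\le \frac{2J + 2\varepsilon}{2J + \varepsilon} \Big[ \sum_{n=N}^{n^*} \sum_{i \in I(n)} \int_{A_i} f_{n^*}(x)\,dx + \sum_{n=N}^{n^*} 2\varepsilon/2^{n+2} \Big].
\end{aligned}$$

The union of the $A_i$, as i ranges over I(n) and n ranges over N, ..., n*, is (a, b].
Also,

$$\sum_{n=N}^{n^*} 2\varepsilon/2^{n+2} < \varepsilon/2^{N} \le \varepsilon/2.$$

Therefore inequality (3.6) yields

$$\begin{aligned}
S(P; g) &< \frac{2J + 2\varepsilon}{2J + \varepsilon} \Big[ \int_a^b f_{n^*}(x)\,dx + \varepsilon/2 \Big] \\
&\le \frac{2J + 2\varepsilon}{2J + \varepsilon}\, \{ J + \varepsilon/2 \} \\
&= J + \varepsilon.
\end{aligned}$$

So the second inequality in (3.2) holds, and the proof is complete.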

The crucial idea of this proof is due to R. Henstock, who proved the analogous
theorem for the "Riemann-complete" integral.

If f is non-negative and continuous on [a,b] except at finitely many points
$x_1^*, \dots, x_k^*$, we can define $f_n(x)$ to be 0 if x has distance less than 1/n from some
$x_i^*$ and to be f(x) otherwise. From Theorem 3.3 we obtain the usual theory of the
"improper" integral of f, which for us is merely the integral of f. By considering
$f^+$ and $f^-$ we obtain the theory of "absolutely convergent improper integrals".
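The truncation device just described can be illustrated numerically. The sketch below assumes $f(x) = x^{-1/2}$ on (0, 1] with the single exceptional point 0, for which the integral of the n-th truncation is $2 - 2n^{-1/2}$ by elementary calculus. The midpoint rule and the names `truncated` and `midpoint_sum` are illustrative choices, not part of the text.

```python
import math

def truncated(f, singular_points, n):
    """The n-th truncation: 0 within distance 1/n of a singular point, f elsewhere."""
    def f_n(x):
        if any(abs(x - p) < 1.0 / n for p in singular_points):
            return 0.0
        return f(x)
    return f_n

def midpoint_sum(g, a, b, cells=100_000):
    """Riemann sum of g over (a, b] with equal cells tagged at their midpoints."""
    h = (b - a) / cells
    return sum(g(a + (i + 0.5) * h) for i in range(cells)) * h

f = lambda x: 1.0 / math.sqrt(x)
ns = (4, 16, 64, 256)
approx = [midpoint_sum(truncated(f, [0.0], n), 0.0, 1.0) for n in ns]
exact = [2.0 - 2.0 / math.sqrt(n) for n in ns]     # integral of x^(-1/2) over [1/n, 1]
for n, s, e in zip(ns, approx, exact):
    print(n, round(s, 4), round(e, 4))
```

The computed values increase with n toward 2, in agreement with the monotone convergence lemma and with (2.5) for b = 1.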


4. Multidimensional integrals. Next, let a right-closed interval (a,b] in $R^n$ be
defined to be the cartesian product of n right-closed intervals $(a^{(i)}, b^{(i)}]$ in $R^1$,
and let open intervals (a,b) and closed intervals [a,b] be analogously defined.
With this notation for (a,b] we define $m(a,b] = (b^{(1)} - a^{(1)}) \cdots (b^{(n)} - a^{(n)})$. Thus
m(a, b] is length if n = 1, area if n = 2, and volume if n = 3. If f is real-valued
on $[a,b] \subset R^n$, the definition of

$$\int_{(a,b]} f(x)\,dx$$

is the same as Definition 2.1. All theorems generalize at once, except those involving
differentiation. A new concept appears, namely integration by iteration. For this
we can easily prove a form of Fubini's theorem adequate for the needs of advanced
calculus. For simplicity we consider n = 2 and write (u,v) as another symbol for
$x = (x^{(1)}, x^{(2)})$. If $A = (a,b] \times (c,d]$ and f is a step-function on $A^- = [a,b] \times [c,d]$,
it is trivial that


(4.1)  $\int_A f(x)\,dx = \int_c^d \left[ \int_a^b f(u,v)\,du \right] dv.$

Suppose that f is bounded on $A^-$, and that there is a closed subset F of $A^-$
such that f is continuous on $A^- \setminus F$ and f as restricted to F is continuous on F.
We first make the additional hypothesis that $f(x) \le 0$ on F and $f(x) \ge 0$ on $A^- \setminus F$.
For j = 1, 2, 3, ... we subdivide $A^-$ into $4^j$ subintervals by dividing each side of $A^-$
into $2^j$ intervals of equal length, and we define the step-function $s_j$ by giving it on
each subinterval the infimum of values of f on that subinterval. Then $s_1 \le s_2 \le \cdots$,
and (4.1) holds for each $s_j$. By several applications of the monotone convergence
lemma we find that (4.1) holds for f. The additional hypothesis is easily removed.
Moreover, the u and v need not be points of $R^1$. The theorem holds for x, u, v in
$R^n$, $R^p$, $R^q$ respectively, where p + q = n.
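The dyadic step-functions $s_j$ can be computed exactly in a simple case. The sketch below assumes f(u, v) = uv on the unit square, where F is empty and the infimum over each closed subinterval is attained at its lower-left corner because f is nondecreasing in each variable. The function names are illustrative.

```python
from fractions import Fraction

def inf_uv(u0, v0, h):
    # f(u, v) = u * v is nondecreasing in each variable on [0, 1]^2, so its infimum
    # over the closed cell [u0, u0 + h] x [v0, v0 + h] is attained at (u0, v0)
    return u0 * v0

def lower_step_integrals(inf_on_cell, j):
    """Integral of the dyadic step-function s_j over (0, 1] x (0, 1], computed
    (i) as one double sum and (ii) by iteration, summing over u first and then v."""
    n = 2 ** j
    h = Fraction(1, n)
    double = sum(inf_on_cell(p * h, q * h, h) * h * h
                 for p in range(n) for q in range(n))
    iterated = sum(sum(inf_on_cell(p * h, q * h, h) * h for p in range(n)) * h
                   for q in range(n))
    return double, iterated

results = [lower_step_integrals(inf_uv, j) for j in range(1, 6)]
for j, (double, iterated) in enumerate(results, start=1):
    print(j, double, iterated)
```

The two columns agree for every j, which is the trivial step-function case of (4.1). The values increase toward 1/4, the integral of uv over the unit square, as the monotone convergence lemma would require.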

If B is any set in any space, we define $1_B$ to be the characteristic function, or
indicator function, of B; its value is 1 at all points of B and 0 at all other points of
the space. If $1_B$ is integrable over $R^n$, we say that B has finite measure, and we define

(4.2)  $mB = \int_{R^n} 1_B(x)\,dx.$


Technically, we have everything needed for a theory of measure. But the students 
lack the necessary mathematical maturity, and it is better to postpone measure 
theory to a later stage. However, (4.2) expresses the classical formula for ‘‘areas 
by double integration’’ and ‘‘volumes by triple integration,’’ and (4.1) allows us 
to solve all the text-book problems by iterated integration. 


5. Integration over unbounded sets. For the next extension we form $\bar{R}^1$ by
adjoining $+\infty$ and $-\infty$ to the real system $R^1$. These have the expected algebraic
and order properties; we take

$$0 \cdot \infty = 0 \cdot (-\infty) = 0.$$

The neighborhoods of $\infty$ are the half-lines $\{x \in \bar{R}^1 : x > a\}$ for all a in $R^1$; the
neighborhoods of $-\infty$ are the half-lines $\{x \in \bar{R}^1 : x < a\}$. Also, we include the
half-lines $\{x \in \bar{R}^1 : x \le b\}$ and $\{x \in \bar{R}^1 : x > a\}$ and $\bar{R}^1$ itself among the right-closed
intervals in $\bar{R}^1$, and for each of these unbounded intervals A we define $mA = \infty$. If f
is defined on some subset of $R^1$, $f(\infty)$ and $f(-\infty)$ are undefined. We give them the
value 0. Now we can define integrals over unbounded intervals by Definition 2.1
without change. All theorems extend with at most minor changes; and the monotone
convergence lemma provides us with the theory of "absolutely convergent improper
integrals" over intervals $(a, \infty)$, $(-\infty, b)$ and $(-\infty, \infty)$. Again these integrals are
not at all "improper". The limit process which in the customary calculus text defines
the integral is for us merely a convenient device for computing an integral already
defined and studied.




For brevity we omit discussion of three types of integral usually discussed in 
advanced calculus texts. These are integrals of vector-valued functions, to which 
Definition 2.1 applies if we understand the symbol | | to denote the length of a 
vector; line integrals; and surface integrals, about which we have nothing significant
to say.

At this stage the use of our integral in place of the Darboux integral has caused 
very little change in factual information or in the mental effort required to make 
proofs. But the student who has been familiarizing himself with the integral of Def- 
inition 2.1 now has the background material for further advances. Also, he will 
not be subjected to the trauma of being told to discard the integral with which 
he is familiar and change to the Lebesgue integral, for the integral of Definition 2.1 
is the Lebesgue integral. 


6. Probability measures. The use of the integral of §2 is particularly helpful 
to students in advanced undergraduate probability courses. Such courses demand 
more than the Darboux integral can provide. In many texts, discrete distributions 
are well treated; but the general case requires countably additive measures on $\sigma$-algebras
of sets, and authors are caught between the dangers of overloading the students
with measure theory and overloading the book with unproved (and sometimes 
untrue) assertions. Our integral removes this trouble. 

We begin with an elementary probability measure P defined and non-negative
on all right-closed intervals in $R^n$, such that $P(R^n) = 1$, and if a right-closed interval
A is the union of pairwise disjoint right-closed intervals $A_1, \dots, A_k$ then
$P(A) = P(A_1) + \cdots + P(A_k)$. We also add the regularity requirement that for
each right-closed interval A and each positive $\varepsilon$, there is a right-closed interval B
whose closure is contained in A for which $P(B) > P(A) - \varepsilon$. If we substitute P
for m in Definition 2.1 we obtain the definition of

$$\int_A f(x)\,P(dx).$$

When $A = R^n$ and the integral exists it is called the expectation of f, and often
denoted by E(f). In particular, if C is a subset of $R^n$ whose indicator function $1_C$
is integrable, C is called an event and assigned the probability measure

$$P(C) = \int_{R^n} 1_C(x)\,P(dx).$$


The fundamental theorems proved for integrals over intervals in R” carry over 
with only trivial change to this integral (differentiation theorems and the Fubini 
theorem do not), and so does the monotone convergence lemma. It follows readily 
that if $C_1, C_2, \dots$ is a countable set of events, their union is an event; and if they
are pairwise disjoint, then

$$P\Big(\bigcup_i C_i\Big) = \sum_i P(C_i).$$
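A one-dimensional illustration of this section follows, as a sketch under stated assumptions. It takes P((a, b]) = F(b) - F(a) with F the distribution function of the exponential law, which is a choice made only for the example, and forms Riemann sums with P in place of m. The two unbounded cells are tagged at the points at infinity, where f is given the value 0 as in §5, so they contribute nothing to the sum.

```python
import math

def F(x):
    """Distribution function of the exponential law (an illustrative choice)."""
    if x == math.inf:
        return 1.0
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def P(a, b):
    """Elementary probability of the right-closed interval (a, b]."""
    return F(b) - F(a)

def expectation(f, T=40.0, cells=40_000):
    """Riemann sum of f against P over the cells (-inf, 0], equal cells of (0, T],
    and (T, inf]. The unbounded cells are tagged at -inf and +inf, where f is
    assigned the value 0, so only the bounded cells appear in the sum."""
    h = T / cells
    return sum(f((i + 0.5) * h) * P(i * h, (i + 1) * h) for i in range(cells))

total_mass = P(-math.inf, math.inf)                  # P of the whole line
additive = math.isclose(P(0.0, 1.0) + P(1.0, 2.0), P(0.0, 2.0))
mean = expectation(lambda x: x)                      # expectation of f(x) = x
prob_C = expectation(lambda x: 1.0 if 0.0 < x <= 1.0 else 0.0)   # event C = (0, 1]
print(total_mass, additive, round(mean, 4), round(prob_C, 4))
```

The expectation of the indicator of C = (0, 1] reproduces P((0, 1]) = 1 - 1/e, which is how the text assigns probabilities to events.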


1973] HIGHLIGHTS IN THE HISTORY OF SPECTRAL THEORY 359 


Thus our integral serves all the needs of integration with respect to distributions 
in finite-dimensional spaces. 

However, such processes as infinite sequences of tosses of a coin occur in fairly 
elementary probability theory, and at a more advanced level we meet stochastic 
processes, or random functions. An infinite sequence of real numbers is a point of
$R^Z$, where $Z = \{1, 2, 3, \dots\}$; a function on a set T is a point of $R^T$. So the probability
theory of such processes calls for distributions and integration over infinite-dimensional
spaces $R^T$ (where T may be Z). If we define intervals and neighborhoods
in $R^T$ in the manner that has long been customary in topology, we find that
Definition 2.1 applies to this case also.

By now it should be clear that the only thing keeping us from going on and on 
with the full development of the Lebesgue integration theory is the fact that we 
have reached the end of our program of fitting the theory into undergraduate in- 
struction (and, perhaps, of the editor’s patience). Anything that can be proved 
about the Lebesgue integral can be proved about this integral, because it is the 
Lebesgue integral. And for those students, such as engineers, who lack time or 
inclination to work through detailed proofs, we are at least asking them to believe 
unproved statements about an integral they know, rather than about an integral 
whose very definition is unfamiliar to them. 


This paper is an expansion of an address given to the Louisiana-Mississippi Section of the MAA
on February 18, 1972. 


References 


1. E.J. McShane, A Riemann-type integral that includes Lebesgue-Stieltjes, Bochner and 
stochastic integrals, American Mathematical Society, Memoir 88 (1969). 

2. Studies in Modern Analysis, M.A.A. Studies in Mathematics, Vol. 1 (R.C. Buck, editor), 
1962. 


HIGHLIGHTS IN THE HISTORY OF SPECTRAL THEORY 
L. A. STEEN, Saint Olaf College 


Not least because such different objects as atoms, operators and algebras all 
possess spectra, the evolution of spectral theory is one of the most informative 
chapters in the history of contemporary mathematics. The central thrust of the 
modern spectral theorem is that certain linear operators on infinite dimensional 


Lynn Steen received his MIT Ph.D under Kenneth Hoffman. Since then he has been on the 
staff at St. Olaf College except for a year's leave at the Mittag-Leffler Institute on an N.S.F. Fellowship.
He is an Associate Editor of the MONTHLY and his publications include Counterexamples in Topology 
(with J. A. Seebach, Jr., Holt Rinehart & Winston, 1970). Editor. 


360 L. A. STEEN [April 


spaces can be represented in a "diagonal" form. At the beginning of the twentieth
century neither this spectral theorem nor the word "spectrum" itself had entered the
mathematician’s repertoire. Thus, although it has deep roots in the past, the 
mathematical theory of spectra is a distinctly twentieth century phenomenon. 

Today every student of mathematics encounters the spectral theorem not later 
than his first course in functional analysis and often as early as his first course in 
linear algebra. Usually he studies one specimen of the spectral theorem, plucked 
out of historical context and imbedded in the logical context of his particular course. 
Although this scheme is pedagogically efficient and logically aesthetic, it does often 
obscure the fact that the spectral theorem was (and perhaps still is) an evolving 
species. Its evolution is an outstanding example of the counterpoint between pure 
and applied mathematics, for while the motive force in its evolution was the attempt 
to provide adequate mathematical theories for various physical phenomena, the 
forms through which it evolved are precisely those which have marked the development 
of modern abstract analysis. 

So we offer here an austere outline of the evolution of the spectral theorem as a 
microcosmic example of the history of twentieth century mathematics. To understand 
the significance of contemporary achievements and to recognize their continuity with 
the past, we begin with the principal historical roots of our subject. 


1. Principal axes theorem. The only theorem available at the turn of the 
twentieth century which we can with hindsight recognize as a direct forerunner of the 
modern spectral theorem is the principal axes theorem of analytical geometry. It 
should not be surprising that the simplest form of this theorem is contained in the 
writings of the founders of analytical geometry, Pierre de Fermat (1601-1665) and 
René Descartes (1596-1650). For the Euclidean plane $R^2$, this theorem says that a
quadratic form $ax^2 + 2bxy + cy^2$ can be transformed by a rotation of the plane into
the normal form $\alpha x^2 + \beta y^2$, where the principal axes of the normal form coincide
with the new coordinate axes. The essential content of this theorem (that the
algebraic reduction to normal form corresponds to the geometric rotation onto
principal axes) is contained in Descartes' La Géométrie [1637], and was known at
about the same time by Fermat but not published until after his death [1679]. The
term "principal axes" was introduced by Leonhard Euler (1707-1783) in his
investigation of the mechanics of rotating bodies [1765]; Euler also discussed (in
[1748]) the reduction of quadratic forms in two and three dimensions.

The general form of the principal axes theorem asserts that any symmetric
quadratic form $(Ax, x) = \sum \alpha_{ij} x_i x_j$ on $R^n$ can be rewritten by means of an orthogonal
transformation $T\colon R^n \to R^n$ in the normal form $\sum \lambda_i x_i^2$. (A is symmetric if $\alpha_{ij} = \alpha_{ji}$,
and T is orthogonal if it leaves invariant the Euclidean metric on $R^n$.) The
generalization from $R^2$ to $R^n$ of the algebraic part of this theorem (that a quadratic form can be
written as a sum of squares) was discussed by Joseph Louis Lagrange (1736-1813) in
a paper [1759] on the maxima and minima of functions of several variables. In
[1827] Carl Gustav Jacob Jacobi (1804-1851) investigated the principal axes of
various quadratic surfaces, and about the same time Augustin-Louis Cauchy
(1789-1867) showed in [1829] and [1830] that the coefficients $\lambda_i$ of the normal form of a
symmetric quadratic form must be real.

But it was not until the second half of the nineteenth century that the general form 
of the principal axes theorem was achieved when James Joseph Sylvester (1814-1897) 
and Arthur Cayley (1821-1895) used the notation of matrices to systematize the 
algebraic description of n-dimensional space. In [1852] Sylvester showed explicitly 
that the coefficients $\lambda_i$ in the normal form of (Ax, x) are the roots of the characteristic
polynomial $\det(\lambda I - A) = 0$; in [1858] Cayley inaugurated the calculus of matrices,
in which the reduction to normal form corresponded to a diagonalization process
on the matrix A. Specifically, the principal axes theorem says in the language of
matrices that each symmetric real matrix A is orthogonally equivalent to a diagonal
matrix D; in other words, for some orthogonal matrix T, the matrix $D = T^{-1}AT$
is in diagonal form. The diagonal entries of D are the eigenvalues of A, that is, the
roots of the polynomial equation $\det(\lambda I - A) = 0$.
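For the plane, the matrix statement can be checked directly. The following sketch treats the 2 x 2 case only and assumes the rotation angle is chosen by tan 2θ = 2b/(a - c), a standard choice that the text does not spell out. The function name and the sample form are illustrative.

```python
import math

def principal_axes_2x2(a, b, c):
    """Rotate the form a*x^2 + 2*b*x*y + c*y^2 onto its principal axes.

    With A = [[a, b], [b, c]] and the rotation T = [[cs, -sn], [sn, cs]]
    (so that T^{-1} is the transpose of T), returns the entries
    (alpha, beta, off) of D = T^{-1} A T together with the angle theta.
    """
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    alpha = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn    # D[0][0]
    beta = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs     # D[1][1]
    off = (c - a) * cs * sn + b * (cs * cs - sn * sn)        # D[0][1] = D[1][0]
    return alpha, beta, off, theta

# sample form 2x^2 + 2xy + 2y^2, that is a = 2, b = 1, c = 2
alpha, beta, off, theta = principal_axes_2x2(2.0, 1.0, 2.0)

def char_poly(lam, a=2.0, b=1.0, c=2.0):
    """det(lam*I - A) for the sample matrix."""
    return (lam - a) * (lam - c) - b * b

print(alpha, beta, off, theta, char_poly(alpha), char_poly(beta))
```

The off-diagonal entry vanishes up to rounding, and the two diagonal entries are roots of the characteristic polynomial, as the theorem asserts.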

Although the new concepts of matrix theory had an immediate and profound 
influence on British mathematics, their impact on the continent was relatively minor. 
Especially in Germany bilinear forms continued well into the twentieth century to 
be the principal tool of analytical geometry, and in [1878] Georg Frobenius (1849- 
1917) published a systematic account of matrix algebra entirely in the language of 
bilinear forms. So by the end of the nineteenth century we can discern two versions 
of the principal axes theorem: the reduction to normal form of a symmetric bilinear 
form, and the diagonalization of a real symmetric matrix. 


2. Infinite systems of linear equations. The central fact of modern spectral 
theory is that certain linear operators on infinite dimensional spaces can also be 
presented in ‘‘diagonal’’ form. Thus the second historical taproot of spectral theory 
is the evolution of infinite dimensional theory from finite dimensional cases. This
evolution occurred first in algebra—in the solution of systems of linear equations— 
and only much later in geometry. Finite systems of linear equations were solved most 
often throughout the eighteenth and nineteenth centuries by the method of elimination,
as expounded, for instance, in [1770] and [1779] by Euler and Étienne
Bézout (1730-1783). In [1750] Gabriel Cramer (1704-1752) introduced for 3 × 3
systems the rule which now bears his name, although he did not, of course, use the 
concept or notation of determinants. 

Infinite systems of equations were used throughout the eighteenth and nineteenth 
centuries to obtain formal solutions to differential equations by the method of 
undetermined coefficients: if a formal power series with unknown coefficients is 
substituted for the unknown in a given differential equation, the task of solving the 
differential equation is reduced to that of determining the infinitely many unknown 
coefficients. (Of course few at that time worried very much about the convergence of
the power series thus obtained.) If all went well, the infinite system of equations in 
the unknown coefficients would exhibit a recursive pattern which made it possible 
to solve the infinite system by finite dimensional tools. But for this reason precisely, 
these recursive techniques contributed little to the development of a general theory 
of infinite dimensional systems. 

Joseph Fourier (1768-1830) launched the first significant general attack on the 
problem of infinite systems of equations when he attempted to show [1822] that 
every function can be expressed as an infinite linear combination of trigonometric 
terms. The problem of determining the unknown coefficients in these linear combi- 
nations led him directly to the general problem of solving an infinite system of linear 
equations. Fourier’s approach (called the principe des réduites by Frédéric Riesz 
[1913a]) was to solve the first n × n system by ordinary means and let n → ∞.
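
The idea of solving the first n × n block and letting n grow can be tried on a small example. The following is a minimal sketch under stated assumptions: the infinite system x_i + Σ_j 2^{-(i+j)} x_j = 2^{-i} (i, j = 1, 2, ...) is a toy system chosen here for illustration, not one that Fourier studied, and the helper names `solve` and `truncated_solution` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def truncated_solution(n):
    """Solve the first n equations of the toy system in the first n unknowns."""
    a = [[(1.0 if i == j else 0.0) + 2.0 ** -(i + j) for j in range(1, n + 1)]
         for i in range(1, n + 1)]
    b = [2.0 ** -i for i in range(1, n + 1)]
    return solve(a, b)


# First unknown of the truncated systems for increasing n.
first_unknown = [truncated_solution(n)[0] for n in (1, 2, 5, 20)]
```

For this particular system the exact value of the first unknown is 3/8, and the entries of `first_unknown` approach it as n increases. Nothing guarantees such convergence for an arbitrary infinite system, which is part of why the method needed the later theory of infinite determinants.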

Although Fourier’s assertion about the expansion of "arbitrary" functions into
trigonometric series stimulated intense work on the theory of integration, his method
of solving infinite systems of linear equations was virtually ignored. More than fifty
years passed before Theodor Kötteritzsch of Saxony reopened the investigation with
a paper [1870] in which he attempted to extend Cramer’s rule to infinite systems. 
Seven years later the American astronomer George William Hill (1838-1914) published 
in Cambridge, Massachusetts, a monograph [1877b] in which he successfully applied 
to the infinite dimensional case the theory of determinants which had at that time 
only been established for finite dimensional systems. Hill’s work was first disseminated 
in Europe in [1886a] when G. Mittag-Leffler reprinted it in Acta Mathematica in 
the year following the appearance in France of a paper [1885a] by Paul Appell 
(1855-1930) in which he applied the principe des réduites to determine the coefficients 
of the power series expansion of elliptic functions. 

At this point Henri Poincaré (1854-1912) entered the discussion with two papers 
([1885b], [1886b]) in which he provided a rigorous definition for an infinite deter- 
minant in order to clarify the works of Hill and Appell. The work begun in Paris by 
Poincaré was continued in Stockholm by Helge von Koch (1870-1924) who developed 
between 1890 and 1910 an extensive theory of infinite determinants. Von Koch’s 
first major papers on this subject appeared in [1891] and [1892]; his own survey of 
the field in [1910d] provides further references. The more recent survey [1968] by 
Michael Bernkopf includes a complete discussion of these fundamental papers. 


3. Integral equations. The theory of infinite matrices and determinants might 
have led directly to an elementary spectral theorem if someone had generalized the 
diagonalization form of the principal axes theorem. But the road to spectral theory 
was not that straight: the first spectral theorem was achieved only after infinite 
determinants were applied to integral equations, thereby extending the theory from 
the countably to the uncountably infinite. The formal study of integral equations is 
usually traced back to [1823] and [1826] when the young Norwegian genius Niels 
Henrik Abel (1802-1829) used an integral equation to solve a generalized tautochrone
problem concerning the shape of a wire along which a frictionless bead slides under 
the influence of gravity. Somewhat later Joseph Liouville (1809-1882) introduced (in 
[1837]) the method of iteration to solve a specific type of integral equation; in 
[1877a] Carl Neumann (1832-1925) extended Liouville’s iterative method to a more
general setting while investigating a boundary value problem for harmonic functions. 
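
The idea of iteration can be sketched numerically for an equation of the form φ(x) + ∫ K(x, y) φ(y) dy = ψ(x) on [0, 1]: start from φ_0 = ψ and repeatedly substitute the current approximation into the integral. The code below is an illustration only, not Liouville's or Neumann's own formulation. It assumes the kernel K(x, y) = xy and right side ψ(x) = x (chosen here because the exact solution is then φ(x) = 3x/4), and it replaces the integral by a midpoint sum; the function name `neumann_iteration` is hypothetical.

```python
def neumann_iteration(kernel, psi, n=200, sweeps=40):
    """Iterate phi <- psi - K phi on the n midpoints of [0, 1].

    The integral of kernel(x, y) * phi(y) over [0, 1] is approximated
    by the midpoint rule with n equal subintervals.
    """
    xs = [(i + 0.5) / n for i in range(n)]
    rhs = [psi(x) for x in xs]
    phi = rhs[:]  # zeroth approximation: phi_0 = psi
    for _ in range(sweeps):
        phi = [rhs[i] - sum(kernel(xs[i], xs[j]) * phi[j] for j in range(n)) / n
               for i in range(n)]
    return xs, phi


xs, phi = neumann_iteration(lambda x, y: x * y, lambda x: x)
# Largest deviation from the exact solution 3x/4 at the grid points.
max_error = max(abs(p - 0.75 * x) for x, p in zip(xs, phi))
```

The iteration converges here because the kernel is small enough for each substitution to shrink the error; for larger kernels the successive approximations may diverge.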

Neumann’s work precipitated considerable research in integral equations, 
especially by Poincaré in France and in Rome by Vito Volterra (1860-1940). But it 
was not until 1900 that the theory of integral equations became especially relevant 
to the history of spectral theory, for in that year the Swedish mathematician Ivar 
Fredholm (1866-1927), then a docent at the University of Stockholm, applied to 
integral equations the theory of infinite matrices and determinants as developed by 
his colleague von Koch. By mimicking von Koch’s technique for expanding infinite 
determinants, Fredholm developed in [1900] his now famous "alternative" theorem
concerning the solutions $\phi$ of the integral equation

$$\phi(x) + \int_0^1 K(x, y)\,\phi(y)\,dy = \psi(x), \qquad (0 \le x \le 1). \tag{1}$$


Just as Daniel Bernoulli (1700-1784) nearly two centuries earlier had represented 
the vibrating string as the limit of n oscillating particles [1732], so Fredholm considered
the integral equation (1) to be the limiting case of the corresponding linear system

$$\phi(x_i) + \sum_{j=1}^{n} K(x_i, y_j)\,\phi(y_j) = \psi(x_i), \qquad (1 \le i \le n). \tag{2}$$

Fredholm defined a "determinant" $D_K$ for the integral equation (1) which is the
continuous analog of the classical determinant of the n × n system (2) and showed,
in exact analogy to the classical theory for (2), that the integral equation (1) has a
unique solution which can be expressed as the quotient of two "determinants"
whenever $D_K \ne 0$; or alternatively, if $D_K = 0$, then the transposed homogeneous
equation $\phi(x) + \int_0^1 K(y, x)\,\phi(y)\,dy = 0$ has nontrivial solutions and (1) is solvable
if and only if $\psi$ is orthogonal to each of these solutions. Fredholm’s major paper
on this subject appeared in [1903a]; a summary of this work together with later
developments is the substance of his survey article [1910e].


4. David Hilbert. Although there is very little in the papers of either von 
Koch or Fredholm that could be construed as a logical ancestor of the modern 
spectral theorem, we have discussed these developments for two particular reasons— 
one mathematical, the other historical. The twentieth century evolution of infinite 
dimensional spectral theory from the much simpler finite dimensional theory is 
foreshadowed by the nineteenth century development of linear equation and determinant
theory, from the finite to the infinite (von Koch) to the continuous (Fredholm).
But there is even a more direct connection, for when Fredholm’s ideas were introduced 
(by Fredholm’s colleague Eric Holmgren) in David Hilbert’s 1900-01 seminar at
Göttingen, Hilbert, in the words of Hermann Weyl [1944], "caught fire at once".
For the next ten years Hilbert (1862-1943) focused his impressive mathematical
talent exclusively on integral equations, and through a series of six papers published
in Göttingen Nachrichten from 1904 to 1910 (collected and published as one volume
in [1912a]) he outlined the basic definitions and theorems of spectral theory (which
he named) and Hilbert space theory (which he did not name, or even define directly).
Hilbert worked primarily with the integral equation 


$$\phi(x) - \lambda \int_0^1 K(x, y)\,\phi(y)\,dy = \psi(x) \tag{3}$$

together with the analogous finite or infinite dimensional matrix equation

$$\phi(x_i) - \lambda \sum_{j} K(x_i, y_j)\,\phi(y_j) = \psi(x_i). \tag{4}$$


In the process of constructing the machinery necessary to solve these equations, 
Hilbert defined the spectrum of the quadratic form K, distinguished the point 
spectrum from the continuous spectrum, and defined the concept of complete 
continuity which served to separate those forms that had pure point spectra from 
those with more complicated spectra. But most important from the viewpoint of 
this essay, he formulated and proved the spectral theorem—not only for completely 
continuous forms, but for bounded forms as well. 

Hilbert’s papers on integral equations contain an astonishing quantity of what 
we now recognize as modern analysis in classical language. Because he was primarily 
concerned with solving integral equations, Hilbert never applied his results speci- 
fically to matrices or operators; furthermore, because of the position of the parameter 
$\lambda$ in equation (3), all of Hilbert’s eigenvalues and spectral points are reciprocals of
those in use today. And while his theorems had a most modern thrust, his basic 
method of proof was that of Bernoulli and Fredholm—a laborious passage to the 
limit from the corresponding finite case. 
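
The reciprocal relation can be read off from the homogeneous form of Hilbert's equation, in which the parameter multiplies the integral term. Writing $K\phi$ as shorthand for $\int K(x, y)\,\phi(y)\,dy$ (a modern abbreviation assumed here, not Hilbert's notation), a nontrivial solution with zero right side satisfies:

```latex
\phi - \lambda K\phi = 0
\quad\Longleftrightarrow\quad
K\phi = \frac{1}{\lambda}\,\phi \qquad (\lambda \neq 0),
```

so a value $\lambda$ that is exceptional in Hilbert's sense corresponds to the eigenvalue $1/\lambda$ in the convention $K\phi = \mu\phi$ now in use.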

Beginning in 1905 with his doctoral dissertation under Hilbert, Erhard Schmidt 
(1876-1959) generalized and simplified Hilbert’s work by introducing the suggestive 
language of Euclidean geometry. In [1907a], [1907b] and [1908a] Schmidt presented
a definitive theory of "Hilbert’s space" (what we now call $\ell^2$, the space of square
summable sequences) replete with the language of norms, linearity, subspaces and
orthogonal projections. (It was Schmidt who generalized to $\ell^2$ the iterative algorithm
for orthonormalization first introduced in [1883] by Jørgen Pedersen Gram of
Copenhagen.) Schmidt’s conceptual simplifications were immediately incorporated
by Ernst Hellinger (1883-1950) and Hermann Weyl (1885-1955) in their 1907 and
1908 dissertations under Hilbert. In [1909a] Hellinger reformulated the theory of
quadratic forms in the new language of Hilbert and Schmidt, and in the same year
Weyl published an extensive study of bounded forms and their spectra [1909d]. So
by the end of the first decade of the twentieth century we can perceive in the writings
of Hilbert and his pupils the major part of spectral theory for bounded linear
transformations on $\ell^2$.
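
The orthonormalization algorithm associated with Gram and Schmidt is easy to state in finite dimensions. The sketch below is an illustration for vectors in R^3, not a transcription of either author's presentation; it uses the variant that subtracts each projection from the running remainder (often called modified Gram-Schmidt), and the names `dot` and `gram_schmidt` are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same space as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components of w along the vectors already accepted.
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that depend linearly on earlier ones
            basis.append([wi / norm for wi in w])
    return basis


basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

Each new vector is stripped of its components along the earlier ones and then normalized, so the inner products `dot(basis[i], basis[j])` are 1 when i = j and 0 otherwise, up to rounding.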


5. Hilbert-Schmidt spectral theory. Recall, as did Hilbert at the beginning
of his first paper on integral equations [1904a], the principal axes theorem for finite
dimensional spaces. Let $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ be the n (real) eigenvalues of the symmetric
n × n matrix K, listed according to multiplicity. Let $\phi_1, \ldots, \phi_n$ be an orthonormal
collection of corresponding eigenvectors, so $K\phi_i = \lambda_i\phi_i$ for $1 \le i \le n$. Then the
action of K is represented, with respect to the basis $\phi_1, \ldots, \phi_n$, by the diagonal matrix
L with entries $\lambda_i$ on the main diagonal. The matrix T whose rows are the vectors
$\phi_1, \ldots, \phi_n$ is an orthogonal transformation which maps the new basis vectors $\phi_1, \ldots, \phi_n$
back to the original (canonical) basis vectors. Thus $L = T^{-1}KT$ is the diagonalization
of K by the orthogonal transformation T. The matrix L can be written as $\sum_{i=1}^{n} \lambda_i P_i$
where $P_i$ is the projection (i.e., the transformation which projects $R^n$) onto the one
dimensional subspace spanned by $\phi_i$.
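
A small numerical check may make the finite dimensional theorem concrete. The sketch below diagonalizes one 2 × 2 symmetric matrix by a plane rotation; the matrix K = [[2, 1], [1, 2]] and the helper names are assumptions made for this illustration. As a convention of this sketch, the eigenvectors are stored as the columns of T, so that $T^{-1}KT$ (with $T^{-1}$ equal to the transpose of T) is diagonal. Expressed in the original coordinates, the same decomposition reads $K = \sum_i \lambda_i P_i$ with $P_i = \phi_i\phi_i^{T}$.

```python
import math


def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


def diagonalizing_rotation(k):
    """Return an orthogonal T whose columns are eigenvectors of the symmetric 2x2 k."""
    a, b, c = k[0][0], k[0][1], k[1][1]
    # The rotation angle is chosen so that the off-diagonal entry of T^T k T vanishes.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    return [[cs, -sn], [sn, cs]]


K = [[2.0, 1.0], [1.0, 2.0]]
T = diagonalizing_rotation(K)
L = matmul(transpose(T), matmul(K, T))  # T is orthogonal, so T^{-1} = T^T

# Rebuild K from its spectral pieces: K = sum_i lambda_i * phi_i phi_i^T.
eigenvectors = transpose(T)  # row i is the i-th eigenvector
K_rebuilt = [[sum(L[i][i] * eigenvectors[i][r] * eigenvectors[i][s] for i in range(2))
              for s in range(2)] for r in range(2)]
```

Here `L` comes out diagonal with the eigenvalues 3 and 1 on its diagonal (up to rounding), and `K_rebuilt` agrees with `K`.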

Hilbert’s first step in extending this theorem was to generalize the concept of
eigenvalue to the case of an infinite symmetric form K. His new concept was the
spectrum of K, denoted by $\sigma(K)$, which is the set of $\lambda$ for which the transformation
$\lambda I - K$ is not invertible. (Actually Hilbert used $I - \lambda K$ while Schmidt used $\lambda I - K$.)
The subset of $\sigma(K)$ consisting of those $\lambda$ for which the equation $K\phi = \lambda\phi$ has nontrivial
solutions is called the point spectrum of K; this is the strict analog of the set
of eigenvalues. The complement of the point spectrum in $\sigma(K)$ is called the continuous
spectrum. Much of Hilbert’s fourth paper [1906a] is devoted to a study of the
relationships between a transformation K and its spectrum $\sigma(K)$.

One of the simplest relationships Hilbert discovered was that the spectrum of K
is a bounded set whenever K is a bounded transformation, that is, whenever the set
$S = \{\|Kx\| : \|x\| \le 1\}$ is bounded, where the notation $\|\cdot\|$, due to Schmidt, is the
$\ell^2$ norm. In fact, whenever K is symmetric, the least upper bound of S, called the
bound (or norm) of K and denoted by $\|K\|$, is the same as the least upper bound of
$\{|\lambda| : \lambda \in \sigma(K)\}$; this fact is now called the spectral radius theorem. The bounded
linear transformations on $\ell^2$ are important from another point of view, also due to
Hilbert: they are precisely the continuous linear transformations, in the sense that
they preserve strong convergence (i.e., $\|Kx_n - Kx\| \to 0$ whenever $\|x_n - x\| \to 0$).

Hilbert extended the principal axes theorem to symmetric bounded linear transformations;
the spectra of these transformations are bounded subsets of the real
axis. Those $\lambda$ in the point spectrum p(K) of K are like eigenvalues since there exists
an orthonormal collection of corresponding eigenvectors $\phi_\lambda$ satisfying $K\phi_\lambda = \lambda\phi_\lambda$.
If $P_\lambda$ denotes the projection onto the subspace generated by $\phi_\lambda$, we can form the
diagonal transformation $L = \sum \lambda P_\lambda$ where $\lambda$ ranges over the point spectrum p(K).
The transformation L reflects accurately the action of K on the subspace generated
by the eigenvectors $\phi_\lambda$, but since this subspace will in general be strictly smaller than
$\ell^2$ (since we have omitted the continuous spectrum) we cannot say that L and K
represent the same transformation.
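
A standard illustration (not taken from Hilbert's papers) of a bounded symmetric transformation with no point spectrum at all is multiplication by the independent variable. As an assumption of this example, it is stated for square integrable functions on [0, 1] rather than for sequences:

```latex
(Kf)(x) = x\,f(x), \qquad 0 \le x \le 1 .
% If Kf = \lambda f, then (x - \lambda) f(x) = 0 for almost every x,
% so f vanishes almost everywhere and \lambda is not an eigenvalue.
% Yet \lambda I - K fails to be invertible exactly when 0 \le \lambda \le 1,
% because 1/(\lambda - x) is then unbounded on [0,1].
\sigma(K) = [0, 1], \qquad p(K) = \emptyset .
```

For this K the diagonal part built from eigenvectors is zero, so the whole action of K has to be recovered from the continuous spectrum.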

To express the contribution of the continuous spectrum, Hilbert set up an
integral patterned after one defined in [1894] by the Dutch mathematician Thomas-Jean
Stieltjes (1856-1894). In his study of continued fractions, Stieltjes was led (via
the problem of moments) to the integral $\int_a^b f(x)\,dg(x)$ as the limit of the sum
$\sum_i f(\xi_i)\,[g(x_i) - g(x_{i-1})]$ (for continuous f and increasing g). By rewriting the sum
$\sum_i \lambda_i P_{\lambda_i}$ as $\sum_i \lambda_i [E_{\lambda_i} - E_{\lambda_{i-1}}]$, where $E_{\lambda_i} = \sum_{j=1}^{i} P_{\lambda_j}$, Hilbert constructed for the continuous
spectrum s(K) the Stieltjes-type integral $\int_{s(K)} \lambda\,dE_\lambda$ as the limit of sums of the form
$\sum_i \lambda_i [E_{\lambda_i} - E_{\lambda_{i-1}}]$. Then Hilbert’s spectral theorem was that every symmetric bounded
linear transformation on $\ell^2$ can be represented (by means of an orthogonal
transformation) in the "diagonal" form

$$\sum_{p(K)} \lambda P_\lambda + \int_{s(K)} \lambda\,dE_\lambda \tag{5}$$

where the summation is over the point spectrum, and the integral is over the continuous
spectrum.
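
The scalar Stieltjes sums on which this construction is patterned can be computed directly. The sketch below is an illustration with assumed inputs: f(x) = x throughout, first with the smooth increasing function g(x) = x², then with a step function that jumps by 2 at x = 1/2. The function name `stieltjes_sum` and the choice of midpoints as the intermediate points are likewise assumptions of this example.

```python
def stieltjes_sum(f, g, a, b, n):
    """Approximate the Stieltjes integral of f with respect to g over [a, b].

    Forms the sum of f(xi_i) * (g(x_i) - g(x_{i-1})) over n equal
    subintervals, taking xi_i to be the midpoint of the i-th subinterval.
    """
    total = 0.0
    for i in range(1, n + 1):
        left = a + (b - a) * (i - 1) / n
        right = a + (b - a) * i / n
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total


# Smooth g: the sum approximates the integral of x * 2x over [0, 1], i.e. 2/3.
smooth = stieltjes_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, 1000)

# Step g: only the subinterval containing the jump contributes, and the sum
# approximates (size of jump) * f(1/2) = 2 * 0.5 = 1.
jump = stieltjes_sum(lambda x: x, lambda x: 2.0 if x >= 0.5 else 0.0, 0.0, 1.0, 1000)
```

The second case shows why one integral can absorb a discrete sum: a jump in g plays the role that a projection onto an eigenspace plays in the operator version.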

Hilbert completed his spectral theory by identifying a large class of transformations 
whose continuous spectra were empty. He called these transformations completely 
continuous, and Schmidt characterized them by the property of mapping weakly 
convergent sequences to strongly convergent sequences. In other words, the linear 
transformation K is completely continuous if 


$$\| Kx_n - Kx \| \to 0$$


whenever $(y, x_n) \to (y, x)$ for all y. The completely continuous transformations are
the nearest infinite dimensional analog to the finite dimensional transformations,
since their spectra consist entirely of eigenvalues with zero as the only possible
accumulation point; furthermore, every completely continuous symmetric linear
transformation K can be expressed (by an orthogonal transformation) in the diagonal
form $\sum \lambda P_\lambda$ (since $s(K) = \emptyset$).
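
A concrete instance may help; it is offered as an illustration rather than as Hilbert's or Schmidt's own example. Let K be the diagonal transformation on $\ell^2$ that multiplies the n-th coordinate by 1/n, and let $e_n$ be the n-th canonical basis vector. The sequence $e_n$ converges weakly to 0, and K carries it to a strongly convergent sequence, whereas the identity transformation I does not:

```latex
K e_n = \tfrac{1}{n}\, e_n \quad (n = 1, 2, \dots), \qquad
(y, e_n) = y_n \to 0 \ \text{ for every } y \in \ell^2, \qquad
\| K e_n \| = \tfrac{1}{n} \to 0, \qquad
\| I e_n \| = 1 \not\to 0 .
```

So the identity, although bounded, is not completely continuous, while the eigenvalues 1/n of K accumulate only at zero.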

Although Hilbert originally used infinite matrices merely as convenient ap- 
proximations to integral equations, he concluded his theoretical investigation by 
establishing a major link between these two theories, namely that of a complete 
orthogonal system. Such a system $\{\phi_n\}$, either of vectors in the sequence space $\ell^2$ or
of continuous functions on the interval [0,1], is characterized by the orthogonality
relation $(\phi_n, \phi_m) = 0$ if $n \ne m$, together with the fact that every vector (or continuous
function) $\phi$ can be represented by the Fourier-type series $\phi = \sum_{n=1}^{\infty} a_n \phi_n$. The matrix
equation (4) can then be derived (by mathematics, rather than by analogy) from the
integral equation (3) by replacing each continuous function $\phi$, $\psi$ by its Fourier
expansion with respect to the complete orthogonal system $\{\phi_n\}$. This application of
a complete orthogonal system enabled Hilbert to derive Fredholm’s alternative
theorem for the integral equation (1) directly from the corresponding theorem for 
the infinite linear system (2). 

To keep the record straight, we should emphasize again that Hilbert introduced 
spectral theory in the language of quadratic forms, whereas we have reported his 
work primarily in the language of linear transformations on the infinite dimensional 
space $\ell^2$. Barely fifty years had elapsed since Cayley in England and Hermann
Grassmann (1809-1877) in Germany had begun, in [1843] and [1844], the systematic
study of Euclidean n-dimensional space for n > 3. Hilbert and Schmidt were the
first to explore the totally unknown depths of an infinite dimensional space and it 
was not until other such spaces were studied that the broad outlines of a theory of 
linear transformations became clear. The early twentieth century development of 
infinite dimensional (function) spaces is recorded in [1966]. 


6. The Lebesgue Integral. At about the same time as Hilbert was creating his 
spectral theory for spaces of square summable sequences, Henri Lebesgue (1875-1941) 
was developing the new integral which now bears his name ([1901], [1904c]). In
three brief papers in 1907 Friedrich Riesz (1880-1956) and Ernst Fischer (1875-1959)
joined together the works of Hilbert and Lebesgue by showing that Hilbert’s space $\ell^2$
is isomorphic to the space $L^2$ of functions whose square is Lebesgue integrable. In a
subsequent paper [1910c] (in which he introduced the more general $L^p$ spaces), Riesz
derived a spectral theory for $L^2$ entirely analogous to that developed for $\ell^2$ by Hilbert
and Schmidt. 

In the year preceding the appearance of his paper on $L^p$ spaces, Riesz proved in
[1909b] his now famous representation theorem in which he solved a problem first
studied in [1903b] by Jacques Hadamard (1865-1963). What Riesz showed was that
every continuous linear functional on C([a,b]) is a Stieltjes integral $\int f\,dg$ with
respect to some function g of bounded variation. Lebesgue then showed in [1910f],
in direct response to Riesz’s paper, that every Stieltjes integral can be interpreted as
a Lebesgue integral under a proper interpretation of the heuristic formula


$$\int f(x)\,dg(x) = \int f(x)\,g'(x)\,dx.$$


This led Johann Radon (1887-1956) to develop (in [1913b]) integration with respect
to a measure (i.e., a countably additive set function) thus encompassing the integrals
of both Lebesgue and Stieltjes and providing the foundation for all modern theories
of the abstract integral.

We can see from this digression that the evolution of the modern integral was 
closely connected to Hilbert’s creation of spectral theory. Although neither theory 
depended logically on the other, the historical dependence of each on the other is 
quite clear: Hilbert used Stieltjes’ integral to obtain the spectral theorem for $\ell^2$,
while Riesz, following Hilbert, used and thereby immortalized Lebesgue’s integral
by developing the spectral theory of $L^2$.

The second decade of spectral theory was rather uneventful. In Göttingen,
Hilbert had turned his attention to the axiomatization of physics, a task which he had
proposed to the International Congress of Mathematicians in 1900 as the sixth of his
famous 23 problems for twentieth century mathematics. "Physics," he said, "is
much too hard for physicists" ([1970]). In the United States Eliakim Hastings Moore
(1862-1932) at the University of Chicago developed a system of "general analysis"
([1908b], [1912b]) which was designed to include as special cases the work of 
Hilbert, Fredholm and Riesz. But Moore’s results were constrained by the fact that 
European investigators were not then accustomed to receiving new mathematical ideas 
from America. So while Moore’s research had a profound effect on the development 
of mathematics in the United States, it did not influence significantly the direction of 
research on spectral theory. 

Many European efforts from 1910 to 1925 were devoted to exposition and recapi- 
tulation. Riesz [1913a], Fredholm [1910e] and von Koch [1910d] published surveys 
of the theory of infinitely many variables and integral equations, each of which 
contained various forms of Hilbert’s spectral theory. Hilbert’s collected papers on 
integral equations were themselves published in book form in [1912a]. But certainly 
the most impressive survey work of this period was the massive Enzyklopädie der
Mathematischen Wissenschaften which contains in volume II.3.2 a comprehensive
discussion of integral equations and spectral theory by Hellinger and Otto Toeplitz
(1881-1940); this survey paper was also published separately [1928a].


7. Quantum mechanics. In Göttingen in 1925-26 Werner Heisenberg (1901-)
and Erwin Schrödinger (1887-1961) created the theory of quantum mechanics. In
Heisenberg’s theory the physical fact that certain atomic observations cannot be 
made simultaneously was interpreted mathematically to mean that the operations 
which represented these observations were not commutative. Since the algebra of 
matrices is non-commutative, Heisenberg together with Max Born and Pascual 
Jordan ([1925a], [1926a]) represented each physical quantity by an appropriate 
(finite or infinite) matrix, called a transformation; the set of possible values of the
physical quantity was the spectrum of the transformation. (So the spectrum of the 
transformation which represented the energy of an atom was precisely the spectrum 
of the atom.) 

Schrödinger, in contrast, advanced a less unorthodox theory based on his partial
differential wave equation. Following some initial surprise that Schrödinger’s "wave
mechanics" and Heisenberg’s "matrix mechanics" (two theories with substantially
different hypotheses) should yield the same results, Schrödinger unified the two
approaches by showing, in effect, that the eigenvalues (or more generally, the spectrum)
of the differential operator in Schrödinger’s wave equation determine the
corresponding Heisenberg matrix. Similar results were obtained simultaneously
([1925b], [1926b]) by the British physicist Paul A. M. Dirac (1902-). Thus interest
in spectral theory once again became quite intense.

Hilbert himself was astonished that the spectra of his quadratic forms should 
come to be interpreted as atomic spectra. "I developed my theory of infinitely many
variables from purely mathematical interests, and even called it ‘spectral analysis’
without any presentiment that it would later find an application to the actual spectrum
of physics" [1970]. It quickly became clear, however, that Hilbert’s spectral theory
was the proper mathematical basis for the new mechanics. Finite and infinite matrices
were interpreted as transformations on a Hilbert space (still thought of primarily as
$\ell^2$ or $L^2$) and physical quantities were represented by these transformations. The
mathematical machinery of quantum mechanics became that of spectral analysis 
and the renewed activity precipitated the publication by Aurel Wintner (1903-1958) 
of the first book [1929b] devoted to spectral theory. 


Hilbert’s original spectral theorem applied to real quadratic forms (or infinite
matrices) that were bounded and symmetric. This theorem was quickly and easily
extended (by Schmidt and others) to bounded complex matrices $A = (a_{ij})$ for which
$a_{ij} = \bar{a}_{ji}$; such matrices are called Hermitian after the French mathematician
Charles Hermite (1822-1901) who introduced them (in [1855]) and proved their
eigenvalues real. Both symmetric and Hermitian forms may be characterized in
terms of their respective inner product by the relation (Ax, y) = (x, Ay) for all
x, y. Like symmetric matrices, Hermitian transformations have real spectra and,
more generally, play the role of the real number line in the algebra of transformations.
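
That the eigenvalues are real follows in one line from the relation (Ax, y) = (x, Ay). As an assumption of notation, take the inner product to be linear in its first argument and conjugate linear in its second, and let $Ax = \lambda x$ with $x \ne 0$:

```latex
\lambda\,(x, x) = (Ax, x) = (x, Ax) = \bar{\lambda}\,(x, x)
\quad\Longrightarrow\quad \lambda = \bar{\lambda} .
```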


Almost miraculously, it was precisely the Hermitian transformations which
qualified in the new mechanics to represent a physical quantity. One reason for this
is that physical quantities are measured by real numbers, so it is natural to represent
them by those transformations which behave like real numbers. Perhaps a more
compelling justification is that the hypothesis that the transformations of mathematical
physics are Hermitian implies certain fundamental laws (or assumptions) of
physics: if A is Hermitian, the wave equation $\ddot{\phi} = A\phi$ implies the conservation of
energy, a fundamental law of classical mechanics, and the solutions of Schrödinger’s
equation $\dot{\phi} = iA\phi$ will have constant norm, which is a fundamental assumption of
quantum mechanics.
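
The statement about constant norm can be verified by a short computation. It is a sketch under two assumptions: that Schrödinger's equation is read as first order in time, $\dot{\phi} = iA\phi$, and that the inner product is linear in its first argument and conjugate linear in its second. Then, for Hermitian A:

```latex
\frac{d}{dt}\,(\phi, \phi)
  = (\dot{\phi}, \phi) + (\phi, \dot{\phi})
  = (iA\phi, \phi) + (\phi, iA\phi)
  = i\,(A\phi, \phi) - i\,(\phi, A\phi)
  = 0 .
```

The last step is exactly the Hermitian relation $(A\phi, \phi) = (\phi, A\phi)$, so the norm $(\phi, \phi)^{1/2}$ does not change with time.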


Although every observable was represented in the new mechanics by a Hermitian
transformation, it was not necessarily true that every such transformation represented
an observable. Dirac [1930b] added the crucial hypothesis that a Hermitian
transformation represents an observable if and only if its eigenvectors form a complete
(orthogonal) system: his hypothesis was designed to ensure that any vector (representing
a quantum mechanical state) could be expressed as a (possibly infinite)
linear combination of eigenvectors of any given observable. The identification of
transformations with this property is part of the Hilbert-Schmidt spectral theory, but
this theory provided only a partial answer: those Hermitian transformations which
are completely continuous have a complete set of eigenvectors.

This theorem did not provide a satisfactory elucidation of Dirac’s hypotheses
since the transformations of quantum mechanics are usually not completely continuous.
Most of the important transformations in physics involve differentiation of,
say, functions in $L^2$. The theorem on integration by parts shows that differentiation is
formally symmetric, for in this case (Af, g) = (f, Ag) means $\int f'g = \int fg'$. But since
the derivative of a function has practically no relation to the magnitude of the
function, differentiation is neither continuous nor bounded, nor even defined everywhere.
In fact, if a symmetric or Hermitian transformation (like differentiation) were
defined everywhere, it would have to be bounded. This rather surprising result,
which says, in effect, that a candidate for the spectral theorem which fails to be
bounded must fail to be everywhere defined, was demonstrated as early as [1910b]
by Hellinger and Toeplitz.
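
The failure of boundedness is easy to exhibit. As an illustrative choice (not drawn from the papers cited), take the functions $f_n(x) = \sin(n\pi x)$ on the interval [0, 1] with the $L^2$ norm:

```latex
\| f_n \|^2 = \int_0^1 \sin^2(n\pi x)\,dx = \tfrac{1}{2}, \qquad
\| f_n' \|^2 = n^2\pi^2 \int_0^1 \cos^2(n\pi x)\,dx = \tfrac{n^2\pi^2}{2},
\qquad\text{so}\qquad
\frac{\| f_n' \|}{\| f_n \|} = n\pi \to \infty .
```

The functions stay the same size while their derivatives grow without limit, so no single constant can bound differentiation.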

Thus many of the transformations of quantum mechanics, although Hermitian, 
failed nevertheless to satisfy the second of Hilbert’s hypotheses, namely, that they be 
bounded. Like differentiation, they were unbounded and defined only on a dense 
subset of $L^2$. Paul Dirac attempted to overcome the exceptional behavior of differentiation
by introducing his $\delta$-function to provide derivatives where none existed
and thereby to enlarge the set of functions to which the differentiation transformation 
could be applied. Dirac’s approach was highly successful in explaining the new 
quantum mechanics and led eventually to Laurent Schwartz’ theory of distributions 
precisely because it lacked an adequate mathematical foundation. But in 1926 
Dirac’s approach represented more an alternative to spectral theory than an 
extension of it, and it did not really help to extend Hilbert’s theory to 
unbounded transformations. 


8. John von Neumann. After Hilbert, the only major study of unbounded 
transformations was that published in [1923] by Torsten Carleman (1892-1949) in 
Sweden. In this monograph Carleman showed that many of the results of Fredholm 
and Hilbert still hold under a weaker type of boundedness hypothesis. But from the 
viewpoint of spectral theory, the major breakthrough came in 1927-29 when the 
twenty-five year old Hungarian John von Neumann (1903-1957) revolutionized the 
study of spectral theory by introducing the abstract concept of a linear operator 
on Hilbert space. In [1927] von Neumann expressed the transformation theory of 
quantum mechanics in terms of operators on a Hilbert space, and explicitly recognized
the need to extend from the bounded to the unbounded case the spectral theory of 
Hermitian operators. In [1929a] he carried out that extension. 

Before von Neumann, the name "Hilbert space" had been applied principally
to the space $\ell^2$ of square summable sequences (often called "Hilbert’s space") or to
the space $L^2$ of Lebesgue square integrable functions which Riesz had proved
isomorphic to $\ell^2$. The essential properties of these spaces, widely recognized, were
those of a vector space with an inner product which was complete and separable
(i.e., which had a countable dense subset). Von Neumann’s first step in his theory
of linear operators was to define an (abstract) Hilbert space axiomatically as any
separable, complete inner product space. He then defined a general linear operator
on the abstract Hilbert space H as a linear transformation defined on some subset
of H. This subset, called the domain $D_T$ of the operator T, is usually assumed to be
a linear subspace of H, which, like the domain of the differentiation operator, is
dense in H. Von Neumann’s linear operators thus comprehend both the matrices
and quadratic forms of Hilbert’s theory, and the transformations of quantum
mechanics.

A linear operator is continuous if and only if it is bounded, and a bounded linear 
operator with a dense domain can be uniquely extended to a bounded linear operator 
on the whole space H. Every linear operator T with a dense domain has a unique 
adjoint operator T* defined by the relation 


$$(Tx, y) = (x, T^{*}y) \tag{6}$$

for all $x \in D_T$; the domain of T* is the set of $y \in H$ for which (6) holds for all x. An
operator T is called self-adjoint if T = T*, and symmetric if T* is an extension of
T, or equivalently, if (Tx, y) = (x, Ty) whenever $x, y \in D_T$. (In von Neumann’s
papers, the self-adjoint operators were called hypermaximal.)

Every self-adjoint operator is clearly symmetric, and every symmetric operator 
which is everywhere defined must be self-adjoint. Thus for bounded linear operators 
(which either are everywhere defined or may be extended to become so) the concept 
of symmetric and self-adjoint coincide. The Hellinger-Toeplitz theorem, cited in 
section 7 above, can be extended to von Neumann’s operators and shows that any 
symmetric operator which is everywhere defined must be bounded. (This result is 
closely related to a more general theorem due to Stefan Banach (1892-1945), now 
commonly known as the closed graph theorem [1932b].) Thus in von Neumann’s
theory there are precisely three types of symmetric operators:

I. bounded, self-adjoint and everywhere defined;

II. unbounded, self-adjoint and densely but not everywhere defined; and 

III. unbounded, not self-adjoint, and densely but not everywhere defined. 

Hilbert’s original theory applied to operators of type I, while von Neumann’s 
spectral theorem encompassed those of type II as well since it applies to all self- 
adjoint operators. This theory, though initiated by von Neumann, was developed by 
Riesz [1930c] and, more extensively, by Marshall H. Stone (1903-) at Yale University
who expounded it in great detail in [1932a]. The combined (but largely independent) 
efforts of von Neumann and Stone for the five year period 1927-1932 provided for 
spectral theory the largest collection of new methods since Hilbert’s five year effort 
of 1901-1906. 


9. Von Neumann-Stone Spectral Theory. Hilbert’s general spectral theorem
says that every bounded symmetric linear transformation T can be written in the 
form 


T = \sum_{p(T)} \lambda_i P_i + \int_{c(T)} \lambda \, dE_\lambda.




By rewriting the first sum as a Stieltjes-type integral and combining it with the 
second integral, we may express Hilbert’s spectral theorem in the concise form 


(7) T = \int_{\sigma(T)} \lambda \, dE_\lambda,

where the integral is over the entire (bounded) spectrum of T. The operators E_λ are
projections with the following properties:

(i) If λ < μ, the range of E_λ is contained in the range of E_μ;

(ii) If ε > 0, E_{λ+ε} → E_λ as ε → 0;

(iii) E_λ → 0 as λ → −∞;

(iv) E_λ → I as λ → +∞.

Stone called such a family of operators a resolution of the identity; in more intuitive
language, properties (i)-(iv) require that the function λ → E_λ be increasing, continuous
from the right, with 0 and I as left and right limiting values.

The von Neumann-Stone extension of the spectral theorem for self-adjoint 
operators from the bounded to the unbounded case corresponds to the extension of 
(7) from bounded to unbounded spectra σ(T). Specifically, it says that to each self-
adjoint operator T there corresponds a unique resolution of the identity {E_λ} such
that (7) holds.
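
A finite-dimensional illustration may help fix ideas. The matrix and projections below are our own example, not taken from the sources discussed; for a symmetric 2 × 2 matrix with eigenvalues 1 and 3 the resolution of the identity is a step function, and the Stieltjes integral (7) collapses to a finite sum.

```latex
T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
P_1 = \tfrac12 \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
P_3 = \tfrac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},

E_\lambda =
\begin{cases}
0,             & \lambda < 1, \\
P_1,           & 1 \le \lambda < 3, \\
P_1 + P_3 = I, & \lambda \ge 3,
\end{cases}
\qquad
\int \lambda \, dE_\lambda = 1 \cdot P_1 + 3 \cdot P_3 = T .
```

Here E_λ is increasing, continuous from the right, and jumps only at the two eigenvalues, in agreement with properties (i)-(iv).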

Despite the power of this theorem, many differential operators are not covered 
by it since they are rarely self-adjoint. For instance, to make the operator D = d/dt 
symmetric on a dense subset A of the Hilbert space L²(0,1), we should select for A
the subset consisting of those continuously differentiable functions f which satisfy
f(0) = f(1) = 0 (in order to ensure that the relation (Df, g) = (f, Dg) would follow
by integration by parts). But the domain A is too small to permit D to be self-adjoint,
for every continuously differentiable L² function is in the domain of D*. To make D
self-adjoint we would have to enlarge its domain appropriately—thereby risking a loss 
of symmetry. Each symmetric operator of type III suffers from the same disease: its 
domain is smaller than that of its adjoint. Moreover the cure—namely, extension 
of the domain—is often fatal since with a larger domain the operator may fail to be 
symmetric. 
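
The integration by parts behind this example can be written out as a short sketch. We assume the inner product (f, g) = ∫ f ḡ dt on L²(0,1); this choice of notation is ours.

```latex
(Df, g) = \int_0^1 f'(t)\,\overline{g(t)}\,dt
        = \Bigl[\, f(t)\,\overline{g(t)} \,\Bigr]_0^1
          - \int_0^1 f(t)\,\overline{g'(t)}\,dt .
```

The boundary term vanishes as soon as f(0) = f(1) = 0, whatever values g takes at the endpoints. This is why every continuously differentiable g lies in the domain of D*, while only those g vanishing at 0 and 1 lie in A. The remaining integral equals −(f, Dg); the minus sign is customarily absorbed by working with i d/dt in place of d/dt, a normalization we assume here because the text does not spell it out.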

To apply his spectral theorem to symmetric operators von Neumann had to 
know which types of symmetric operators admit self-adjoint extensions. He [1929a]
and Wintner [1929b] identified a large class of such operators, namely those operators
T, called semibounded, for which there is a positive constant M satisfying either
(Tx, x) ≤ M‖x‖² for all x ∈ D_T or −M‖x‖² ≤ (Tx, x) for all x ∈ D_T. The best
statement of this result is due to Stone [1932a] and Kurt O. Friedrichs [1934]: every
semibounded symmetric operator may be extended to a semibounded self-adjoint
operator with the same bound.

Whereas the central focus of the von Neumann-Stone spectral theory (and of
Hilbert’s also) is on operators with real spectra, the spectral theorem does apply, at
least in two cases, to operators with more general spectra. The simplest case concerns
isometric operators which leave the inner product on H invariant; from this definition
it follows easily that the spectrum of an isometric operator is a subset of the unit
circle. An isometric operator that maps H onto H is called unitary and is charac-
terized by the fact that its adjoint is its inverse (i.e., TT* = T*T = I). Unitary
operators were first studied in [1909c] by Issai Schur following their introduction
by Léon Autonne in [1902]. In [1929a] von Neumann employed the Cayley transform
C: T → (T − iI)(T + iI)^{-1} to map symmetric operators T into isometric operators
C(T); he showed that T is self-adjoint if and only if C(T) is unitary. Thus the spectral
theory for unitary operators follows from that for self-adjoint operators by use of a
spectral integral on the unit circle instead of on the real line.
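
The Cayley transform is easy to check numerically in finite dimensions. The following is a minimal sketch, assuming 2 × 2 complex matrices stored as nested lists; the helper names and the sample matrix are ours and purely illustrative.

```python
# Cayley transform on 2x2 complex matrices (standard library only).
# Helper names are illustrative; they do not come from the article.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def shift(a, z):
    """Return a + z*I."""
    return [[a[0][0] + z, a[0][1]], [a[1][0], a[1][1] + z]]

def cayley(t):
    """Cayley transform C(T) = (T - iI)(T + iI)^{-1}."""
    return matmul(shift(t, -1j), inverse(shift(t, 1j)))

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]
T = [[2, 1 - 1j], [1 + 1j, 3]]          # a self-adjoint (Hermitian) example
U = cayley(T)

# C(T) should be unitary: U U* = U* U = I.
is_unitary = (close(matmul(U, adjoint(U)), IDENTITY)
              and close(matmul(adjoint(U), U), IDENTITY))

# Scalar case: a real number t is sent to a point of the unit circle.
scalar_image = (2.5 - 1j) / (2.5 + 1j)
```

The scalar line shows the underlying mechanism: the map t → (t − i)/(t + i) carries the real line into the unit circle, which is why a real spectrum becomes a spectrum on the circle.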

Now every bounded linear operator T can be written in the “Cartesian” form
T = A + iB, where A and B are bounded and self-adjoint; in fact,

A = \frac{1}{2}(T + T^*), \qquad B = \frac{1}{2i}(T - T^*).


Thus it would appear likely that the spectral theorem could be extended to all 
bounded linear operators by using this decomposition. However, the details of that 
extension require that AB = BA (or equivalently, that TT* = T*T). So the desired 
extension works only for those operators which commute with their adjoints: such 
operators are called normal, after Toeplitz [1918a]. Toeplitz extended Hilbert’s
spectral theorem to completely continuous normal quadratic forms by showing that 
such a form was unitarily equivalent to a diagonal form. More generally, the spectral 
resolution 


T = \int_{\sigma(T)} \lambda \, dE_\lambda

extends to bounded normal operators, where the integration is over the spectrum of 
T, which is a compact subset of the complex plane contained in the disc of radius
‖T‖. Von Neumann [1930a] and Stone [1932a] extended both the definition and
spectral theory of normal operators to the unbounded case as well. 
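
The link between the Cartesian decomposition and normality can be tested on small matrices. This is a sketch under our own assumptions: 2 × 2 complex matrices as nested lists, hypothetical helper names, and two sample matrices chosen by us (one normal, one not).

```python
# Cartesian decomposition T = A + iB and the normality criterion AB = BA.
# Helper names and sample matrices are illustrative only.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

def cartesian_parts(t):
    """Return (A, B) with A = (T + T*)/2 and B = (T - T*)/(2i)."""
    ts = adjoint(t)
    a = [[(t[i][j] + ts[i][j]) / 2 for j in range(2)] for i in range(2)]
    b = [[(t[i][j] - ts[i][j]) / 2j for j in range(2)] for i in range(2)]
    return a, b

def recombine(a, b):
    """Return A + iB."""
    return [[a[i][j] + 1j * b[i][j] for j in range(2)] for i in range(2)]

def is_normal(t):
    """True when T T* = T* T."""
    return close(matmul(t, adjoint(t)), matmul(adjoint(t), t))

def parts_commute(t):
    """True when the Cartesian parts satisfy AB = BA."""
    a, b = cartesian_parts(t)
    return close(matmul(a, b), matmul(b, a))

NORMAL = [[1, 1j], [1j, 1]]      # commutes with its adjoint
NILPOTENT = [[0, 1], [0, 0]]     # does not commute with its adjoint
```

For the first matrix both `is_normal` and `parts_commute` hold; for the second both fail, which is the equivalence stated in the text.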

We have come a long way from the principal axes theorem, and the spectral 
theorems of von Neumann and Stone reflect far more analysis than geometry. The 
geometric content of the spectral theorem for finite dimensional space is that the 
entire space can be expressed as the direct sum of subspaces on each of which the 
given transformation acts like simple multiplication. But this theorem fails in the 
infinite dimensional cases as soon as the continuous spectrum appears. In a paper 
written in 1938 but not published until [1949a], von Neumann effectively resuscitated
the geometrical spectral theorem by defining a direct integral of Hilbert spaces (in 
strict analogy with the direct sum). He then showed that the action of a self-adjoint 
operator on any Hilbert space could be represented as the accumulated effect of
simple multiplications on certain subspaces whose direct integral was (unitarily
equivalent to) the original space. 


10. Gelfand-Naimark Theorem. The collection of all operators on a Hilbert space 
forms a ring; such rings, with various topologies, were extensively investigated by 
von Neumann and Francis J. Murray in [1936a], [1937a] and [1940a]. During the
same period 1936-40 S. W. P. Steen published in England a series of five papers
([1936b], [1937b], [1938a], [1939], [1940b]) devoted to an axiomatic theory of
operators. But the papers that offered the most significant insight into the spectral
theorem were [1941a], [1941b] and [1943] published in the U.S.S.R. by Israel
M. Gelfand, Mark A. Naimark and Georgii E. Silov. Gelfand and his colleagues
created a theory of normed rings which not only subsumed much of the work of von 
Neumann, Murray and Steen on rings of operators, but also provided a beautiful 
general setting for the study of Fourier transforms and harmonic analysis. Related 
studies were carried out in the United States by Stone ([1940c], [1941c]) and Shizuo 
Kakutani [1941d]. 

Normed rings were first introduced in [1936c] by the Japanese mathematician
Mitio Nagumo under the name of “linear metric rings”. In [1946] Charles E.
Rickart christened Gelfand’s normed rings “Banach algebras” to avoid
misunderstanding due to the algebraic meaning of “ring”; as a consequence Russian
mathematicians now use the former name, while Americans use the latter. But
regardless of its name, the properties of a Banach algebra are those of a complete
normed algebra (over the complex field C) satisfying the multiplicative triangle
inequality ‖xy‖ ≤ ‖x‖ ‖y‖. We shall assume that each Banach algebra contains an
identity element e, where ‖e‖ = 1. The set of all bounded linear operators on a
Hilbert space is a Banach algebra, as is the set of all continuous complex-valued
functions on a compact topological space X (with the sup norm ‖f‖ = sup
{ |f(x)| : x ∈ X }). The part of Banach algebra theory germane to spectral theory is
the relation between these two examples.

Gelfand’s theory of commutative Banach algebras depends on three fundamental 
concepts: homomorphisms, maximal ideals and spectra. A homomorphism of a 
commutative Banach algebra B is a non-zero multiplicative linear functional; its 
kernel is a maximal ideal since it is contained in no larger proper ideal. Moreover,
every maximal ideal I is the kernel of some homomorphism, for in this case the factor
algebra B/I is the field C of complex numbers (according to a result announced by
Stanislaw Mazur [1938b] and proved by Gelfand [1941a]), so the composite map
B → B/I → C is a homomorphism of B whose kernel is I. The set M_B of homo-
morphisms (or equivalently, of maximal ideals) of B is given the weakest topology
relative to which all of the functions x̂: h → h(x) are continuous, for all x ∈ B. Then
the topological space M_B is compact and Hausdorff, and each element x of B is
represented in C(M_B) (the Banach algebra of continuous complex valued functions
on M_B) by its “Gelfand transform” x̂.




In strict analogy with the spectral theory of operators on a Hilbert space, Gelfand 
defines the spectrum σ(x) of an element x ∈ B to be the set of complex numbers λ for
which the element x − λe has no inverse. The set σ(x) is compact, non-empty and
contained in the disc of radius ‖x‖. Furthermore, σ(x) happens to be precisely equal
to the range of the Gelfand transform x̂: σ(x) = { h(x) | h ∈ M_B }. For this reason the
space M_B of maximal ideals is often called the spectrum of the Banach algebra B.
If B is the algebra generated by a single element x (such as a particular operator
on H), then the spectrum of the algebra B is mapped homeomorphically by x̂ onto
the spectrum of x.

In [1943] Gelfand and Naimark showed that the commutative Banach algebra
C(M_B) is characterized by the presence of an involution, namely the operation of
complex conjugation *: f → f̄. Specifically, they showed that any commutative Banach
algebra with an involution (called a B*-algebra) is isometrically isomorphic to the
algebra C(M_B) for some Banach algebra B. In particular, the commutative B*-algebra
B(T) generated by a given bounded normal operator T is isomorphic to the algebra
C(σ(T)) of all continuous functions on σ(T), the spectrum of B(T); T is assumed
normal in order that the presence of the involution *: T → T* should not destroy the
commutativity of the algebra B(T).

The impact of the Gelfand-Naimark theorem on spectral theory is this: the 
spectral theorem for a bounded normal operator T can be inferred via the isomorphism 
between B(T) and C(σ(T)) from a corresponding theorem concerning continuous
complex valued functions on σ(T). The required theorem is just that every continuous
function f on σ(T) (in particular, the identity function f(λ) = λ) can be approximated
uniformly by measurable step functions of the form Σ f(λ_i) χ_{A_i}, where χ_{A_i} is the
characteristic function of the measurable set A_i. The translation of this theorem to
the algebra B(T) (in the special case f(λ) = λ) is the spectral resolution of the bounded
normal operator T: T = ∫ λ dE_λ. In words instead of symbols, the approximation
theorem says that a continuous function can be approximated by linear combinations
of characteristic functions, while the spectral theorem says that bounded normal 
operators can be approximated by linear combinations of projections. Thus Gelfand’s 
theory of Banach algebras revealed that the spectral theorem is in some fundamental 
sense equivalent to a most rudimentary fact in the theory of functions. 


Gelfand’s theory actually yields a spectral theorem far stronger than those which 
we have so far discussed. By translating the approximation theorem for an arbitrary 
continuous function f we obtain a spectral resolution of the form f(T) = ∫ f(λ) dE_λ.
This formula was originally introduced by von Neumann and Stone as the basis of
their “operational calculus”. A related general spectral theorem, also due to von
Neumann [1930a], can be inferred from the Gelfand-Naimark isomorphism: any
commutative family of bounded normal operators admits a simultaneous diagonaliza-
tion, that is, a single resolution of the identity which simultaneously represents all
operators in the family by means of the integral ∫ f(λ) dE_λ for various functions f.
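
In finite dimensions the formula f(T) = ∫ f(λ) dE_λ reduces to a finite sum over eigenvalues, which makes the operational calculus easy to try out. The sketch below assumes a symmetric 2 × 2 matrix with eigenvalues 1 and 3 and its two spectral projections; the matrix, the names and the choice f = square root are ours.

```python
import math

# Spectral data of the symmetric matrix [[2, 1], [1, 2]] (our own example).
EIGENVALUES = [1.0, 3.0]
PROJECTIONS = [
    [[0.5, -0.5], [-0.5, 0.5]],   # projection onto the span of (1, -1)
    [[0.5, 0.5], [0.5, 0.5]],     # projection onto the span of (1, 1)
]

def apply_function(f):
    """Finite-dimensional operational calculus: f(T) = sum_i f(lambda_i) P_i."""
    return [[sum(f(lam) * p[i][j] for lam, p in zip(EIGENVALUES, PROJECTIONS))
             for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

T = apply_function(lambda lam: lam)   # f(lambda) = lambda recovers T itself
ROOT = apply_function(math.sqrt)      # f(lambda) = sqrt(lambda) gives a square root of T
```

Because the projections are orthogonal to one another and idempotent, multiplying `ROOT` by itself returns `T`; functions of T multiply exactly as the corresponding functions of λ do.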


[Fig. 1. Evolutionary tree of spectral theory; diagram not reproduced. Root labels: Analytical Geometry, Algebra, Analysis. Branch labels: Principal axes theorem; Infinite systems of linear equations; Integral equations. Names and dates shown on the tree: Descartes 1637; Fermat 1679; Bernoulli 1732; Lagrange 1759; Fourier 1822; Jacobi 1827; Cauchy 1829-30; Liouville 1837; Sylvester 1852; Kötteritzsch 1870; Hill 1877; Neumann 1877; Frobenius 1878; Appell 1885; Poincaré 1885-6; von Koch 1891-92; Stieltjes 1894; Fredholm 1900-03; Lebesgue 1901-04; Fischer, Riesz 1907-10; Lebesgue 1910; Heisenberg, Schrödinger 1925-26; von Neumann 1927, 1929; Riesz 1930; Stone 1932; Steen 1936-40; Murray, von Neumann 1936-40; Gelfand, Silov, Naimark 1941-43.]


11. Unfinished business. This concludes our saga of the spectral theorem. Our 
historical vision has been deliberately narrow, focused throughout on the evolution 
of just one theorem, and only rarely have we glanced at the many fascinating
applications and extensions of the basic theory. For example, spectral theory for
spaces without inner products can be traced back to Riesz [1918b] and T. H.
Hildebrandt [1931], while the rudiments of spectral theory for differential operators
are contained in the work [1908c] of George Birkhoff; in [1928b] and [1930d]
Norbert Wiener developed a theory of spectral analysis for functions in an attempt 
to analyze mathematically the spectrum of white light, while twenty years later Arne 
Beurling [1949b] inaugurated the complementary study of spectral synthesis; and 
in [1942] Edgar R. Lorch, continuing work begun in [1913a] by F. Riesz, inves- 
tigated spectral sets in the plane by means of contour integrals. 

Had we stopped to investigate each such offshoot our evolutionary tree (Figure 1) 
would have looked like a forest. Indeed, it took Nelson Dunford and Jacob Schwartz 
nearly 3000 pages to survey spectral theory ([1958a], [1963], [1971]). So any who 
are inspired to examine the fruits of spectral theory are invited to read this treatise 
or any of its many less ambitious companions ([1951], [1953], [1958b], [1962]).
Our mission to describe the roots and main trunk of spectral theory is accomplished. 


References 


1637 René Descartes, La Géométrie, Appendix I to Discours de la Méthode, Leiden, 1637, (Oeuvres,
VI, 367-514).

1679 Pierre de Fermat, Ad locos planos et solidos isagoge, Varia Opera Mathematica, 1679, (Oeuvres,
I, 91-110).

1732 Daniel Bernoulli, Theoremata de oscillationibus corporum filo flexili connexorum et catenae
verticaliter suspensae, Comm. Acad. Scient. Imper. Petropolitanae, 6 (1732) 108-122. 

1748 Leonhard Euler, Introductio in Analysin Infinitorum, 1748, (Opera Omnia (1), IX, 379-392).

1750 Gabriel Cramer, Introduction à l’analyse des lignes courbes algébriques, Geneva, 1750.

1759 Joseph-Louis Lagrange, Recherches sur la méthode de maximis et minimis, Miscellanea 
Taurinensia, 1 (1759) 18-42, (Oeuvres, I, 3-20). 

1765 Leonhard Euler, Theoria Motus Corporum Solidorum Seu Rigidorum, 1765, (Opera Omnia 
(2) III, 193-214). 

1770 Leonhard Euler, Vollständige Anleitung zur Algebra, St. Petersburg, 1770.

1779 Etienne Bézout, Théorie générale des équations algébriques, Paris, 1779. 

1822 Joseph B. J. Fourier, Théorie analytique de la chaleur, Paris, 1822. 

1823 Niels Henrik Abel, Solution de quelques problèmes à l’aide d’intégrales définies, Magazin for
Naturv., 1 (1823) 205-215, (Oeuvres, I, 11-27).

1826 Niels Henrik Abel, Résolution d’un problème de mécanique, J. Reine Angew. Math., 1 (1826)
153-157, (Oeuvres, I, 97-101).

1827 Carl G. J. Jacobi, Über die Hauptaxen der Flächen der Zweiten Ordnung, J. Reine Angew.
Math., 2 (1827) 227-233, (Werke, III, 45-53).

1829 Augustin-Louis Cauchy, Sur l’équation à l’aide de laquelle on détermine les inégalités séculaires
des mouvements des planètes, Exercices de Mathématiques, Paris, 1829, (Oeuvres (2), IX,
174-195). 




1830 Augustin-Louis Cauchy, Mémoire sur l’équation qui a pour racines les moments d’inertie
principaux d’un corps solide et sur diverses équations du même genre, Mém. Acad. Sci. Inst.
France, 9 (1830) 111-113, (Oeuvres, (1), II, 79-81).

1837 Joseph Liouville, Sur le développement des fonctions... (second mémoire), J. Math. Pures Appl.,
2 (1837) 16-35.

1843 Arthur Cayley, Chapters in the analytical geometry of (n) dimensions, Camb. Math. J., 4 (1843)
119-127, (Math. Papers, I, 55-62).

1844 Hermann Grassmann, Die Ausdehnungslehre, Leipzig, 1844. 

1852 James Joseph Sylvester, A demonstration of the theorem that every homogeneous quadratic 
polynomial is reducible by real orthogonal substitution to the form of a sum of positive and 
negative squares, Phil. Mag., 4 (1852) 138-142, (Math. Papers, I, 378-381).

1855 Charles Hermite, Remarque sur un théorème de M. Cauchy, C. R. Acad. Sci. Paris, 41 (1855)
181-183. 

1858 Arthur Cayley, A memoir on the theory of matrices, Philos. Trans. Roy. Soc. London, 148 
(1858) 17-37, (Math. Papers, II, 475-496). 

1870 Theodor Kötteritzsch, Über die Auflösung eines Systems von unendlich vielen linearen
Gleichungen, Z. Math. Physik, 15 (1870) 1-15, 229-268.

1877a Carl Neumann, Untersuchungen über das logarithmische und Newton’sche Potential, Teubner,
Leipzig, 1877.

1877b George William Hill, On the Part of the Motion of the Lunar Perigee which is a Function of 
the Mean Motions of the Sun and Moon, John Wilson, Cambridge, Mass., 1877. 

1878 Georg Frobenius, Über lineare Substitutionen und bilineare Formen, J. Reine Angew. Math.,
84 (1878) 1-63, (Gesammelte Abhandlungen, I, 343-405).

1883 Jørgen Pedersen Gram, Über die Entwickelung reeller Funktionen in Reihen mittelst der
Methode der kleinsten Quadrate, J. Reine Angew. Math., 94 (1883) 41-73.

1885a Paul Appell, Sur une méthode élémentaire pour obtenir les développements en série
trigonométrique des fonctions elliptiques, Bull. Soc. Math. France, 13 (1885) 13-18.

1885b Henri Poincaré, Remarques sur l’emploi de la méthode précédente, Bull. Soc. Math. France,
13 (1885) 19-27, (Oeuvres, V, 85-94).

1886a George William Hill, On the part of the motion of the lunar perigee which is a function of
the sun and the moon, Acta Mathematica, 8 (1886) 1-36, (Coll. Math. Works, I, 243-270).


1886b Henri Poincaré, Sur les déterminants d’ordre infini, Bull. Soc. Math. France, 14 (1886) 77-90, 
(Oeuvres, V, 95-107). 

1891 Helge von Koch, Sur une application des déterminants infinis à la théorie des équations
différentielles linéaires, Acta Mathematica, 15 (1891) 53-63.

1892 Helge von Koch, Sur les déterminants infinis et les équations différentielles linéaires, Acta
Mathematica, 16 (1892) 217-295.

1894 Thomas-Jean Stieltjes, Recherches sur les fractions continues, Ann. Fac. Sci. Toulouse, 8 (1894)
J. 1-122, (Oeuvres, II, 402-566).

1900 Ivar Fredholm, Sur une nouvelle méthode pour la résolution du problème de Dirichlet, Öfver.
Vet. Akad. Förhand., Stockholm, 57 (1900) 39-46.

1901 Henri Lebesgue, Sur une généralisation de l’intégrale définie, C. R. Acad. Sci. Paris, 132 (1901) 
1025-1028.

1902 Léon Autonne, Sur l’Hermitien, Rend. Circ. Mat. Palermo, 16 (1902) 104-128.

1903a Ivar Fredholm, Sur une classe d’équations fonctionnelles, Acta Mathematica, 27 (1903) 
365-390. 

1903b Jacques Hadamard, Sur les opérations fonctionnelles, C. R. Acad. Sci. Paris, 136 (1903) 351-
354. 




1904a David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Erste
Mitteilung, Göttingen Nachrichten, (1904) 49-91.

1904b David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen,
Zweite Mitteilung, Göttingen Nachrichten, (1904) 213-259.

1904c Henri Lebesgue, Leçons sur l’intégration et la recherche des fonctions primitives, Gauthier-
Villars, Paris, 1904.

1905 David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Dritte
Mitteilung, Göttingen Nachrichten, (1905) 307-338.

1906a David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen,
Vierte Mitteilung, Göttingen Nachrichten, (1906) 157-227.

1906b David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen,
Fünfte Mitteilung, Göttingen Nachrichten, (1906) 439-480.

1907a Erhard Schmidt, Zur Theorie der linearen und nichtlinearen Integralgleichungen, I, Math. 
Annalen, 63 (1907) 433-476. 

1907b Erhard Schmidt, Zur Theorie der linearen und nichtlinearen Integralgleichungen, II, Math. 
Annalen, 64 (1907) 161-174. 

1907c Friedrich Riesz, Sur les systèmes orthogonaux de fonctions, C. R. Acad. Sci. Paris, 144 (1907)
615-619.

1907d Friedrich Riesz, Über orthogonale Funktionensysteme, Göttingen Nachrichten, (1907)
116-122.

1907e Ernst Fischer, Sur la convergence en moyenne, C. R. Acad. Sci. Paris, 144 (1907) 1022-1024.

1908a Erhard Schmidt, Über die Auflösung linearer Gleichungen mit unendlich vielen Unbekannten,
Rend. Circ. Mat. Palermo, 25 (1908) 53-77.

1908b Eliakim Hastings Moore, On a form of general analysis with application to linear differential 
and integral equations, Cong. Int. d. Math., Rome, 1908, 98-114. 

1908c George D. Birkhoff, Boundary value and expansion problems of ordinary linear differential 
equations, Trans. Amer. Math. Soc., 9 (1908) 373-395, (Coll. Papers, I, 14-36). 

1909a Ernst Hellinger, Neue Begründung der Theorie quadratischer Formen von unendlichvielen
Veränderlichen, J. Reine Angew. Math., 136 (1909) 210-271.

1909b Friedrich Riesz, Sur les opérations fonctionnelles linéaires, C. R. Acad. Sci. Paris, 149 (1909)
974-77.

1909c Issai Schur, Über die charakteristischen Wurzeln einer linearen Substitution mit einer
Anwendung auf die Theorie der Integralgleichungen, Math. Annalen, 66 (1909) 488-510.

1909d Hermann Weyl, Über beschränkte quadratische Formen deren Differenz vollstetig ist, Rend.
Circ. Mat. Palermo, 27 (1909) 373-392.

1910a David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Sechste
Mitteilung, Göttingen Nachrichten, (1910) 355-419.

1910b Ernst Hellinger and Otto Toeplitz, Grundlagen für eine Theorie der unendlichen Matrizen,
Math. Annalen, 69 (1910) 289-330.

1910c Friedrich Riesz, Untersuchungen über Systeme integrierbarer Funktionen, Math. Annalen,
69 (1910) 449-497.

1910d Helge von Koch, Sur les systèmes d’une infinité d’équations linéaires à une infinité
d’inconnues, C. R. Cong. d. Math., Stockholm, 1910, 43-61.

1910e Ivar Fredholm, Les équations intégrales linéaires, C. R. Cong. d. Math. Stockholm, 1910, 
92-100. 

1910f Henri Lebesgue, Sur l’intégrale de Stieltjes et sur les opérations linéaires, C. R. Acad. Sci. Paris, 
150 (1910) 86-88. 

1912a David Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen,
Teubner, Leipzig, 1912. 




1912b Eliakim Hastings Moore, On the foundations of the theory of linear integral equations, Bull. 
Amer. Math. Soc., 18 (1912) 334-362. 

1913a Frédéric Riesz, Les systèmes d’équations linéaires à une infinité d’inconnues, Gauthier-Villars,
Paris, 1913.

1913b Johann Radon, Theorie und Anwendungen der absolut additiven Mengenfunktionen, Sitz.
Akad. Wiss., Wien, 122 (1913) 1295-1438.

1918a Otto Toeplitz, Das algebraische Analogon zu einem Satze von Fejér, Math. Zeit., 2 (1918)
187-197.

1918b Friedrich Riesz, Über lineare Funktionalgleichungen, Acta Mathematica, 41 (1918) 71-98.

1923 Torsten Carleman, Sur les équations intégrales singulières à noyau réel et symétrique, Uppsala
Univ. Årsskrift, 1923.

1925a Max Born and Pascual Jordan, Zur Quantenmechanik, Z. Physik, 34 (1925) 858-888. 

1925b Paul A. M. Dirac, The fundamental equations of quantum mechanics, Proc. Roy. Soc. 
London, Ser. A., 109 (1925) 642-658. 

1926a Max Born, Werner Heisenberg and Pascual Jordan, Zur Quantenmechanik, II, Z. Physik, 
35 (1926) 557-615. 

1926b Paul A. M. Dirac, On the theory of quantum mechanics, Proc. Roy. Soc. London, Ser. A., 
112 (1926) 661-677. 

1927 John von Neumann, Mathematische Begründung der Quantenmechanik, Göttingen
Nachrichten (1927) 1-57, (Coll. Works, I, 151-207).

1928a Ernst Hellinger and Otto Toeplitz, Integralgleichungen und Gleichungen mit unendlichvielen 
Unbekannten, Teubner, Leipzig, 1928. 

1928b Norbert Wiener, The spectrum of an arbitrary function, Proc. London Math. Soc., 27 (1928) 
483-496. 

1929a John von Neumann, Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren, Math.
Annalen, 102 (1929) 49-131, (Coll. Works, II, 3-85). 

1929b Aurel Wintner, Spektraltheorie der unendlichen Matrizen, Hirzel, Leipzig, 1929. 

1930a John von Neumann, Zur Algebra der Funktionaloperationen und Theorie der normalen 
Operatoren, Math. Annalen, 102 (1930) 370-427, (Coll. Works, II, 86-143). 

1930b Paul A. M. Dirac, The Principles of Quantum Mechanics, Oxford, 1930. 

1930c Friedrich Riesz, Über die linearen Transformationen des komplexen Hilbertschen Raumes,
Acta Litt. Sci. (Szeged), 5 (1930) 23-54.

1930d Norbert Wiener, Generalized harmonic analysis, Acta Mathematica, 55 (1930) 117-258.


1931 T. H. Hildebrandt, Linear functional transformations in general spaces, Bull. Amer. Math. Soc., 
37 (1931) 185-212. 

1932a Marshall H. Stone, Linear Transformations in Hilbert Space, Amer. Math. Soc. Colloq. Publ. 
XV, New York, 1932. 

1932b Stefan Banach, Théorie des opérations linéaires, Warsaw, 1932. 

1934 Kurt O. Friedrichs, Spektraltheorie halbbeschränkter Operatoren und Anwendung auf die
Spektralzerlegung von Differentialoperatoren, Math. Annalen, 109 (1934) 465-487, 685-713. 

1936a Francis J. Murray and John von Neumann, On rings of operators, I, Ann. Math., 37 (1936) 
116-229, (Von Neumann Coll. Works III, 6-119). 

1936b S. W. P. Steen, An introduction to the theory of operators, I, Proc. London Math. Soc., 41 
(1936) 361-392.

1936c Mitio Nagumo, Einige analytische Untersuchungen in linearen metrischen Ringen, Jap. J.
Math., 13 (1936) 61-80.

1937a Francis J. Murray and John von Neumann, On rings of operators, II, Trans. Amer. Math. 
Soc., 41 (1937) 208-248 (Von Neumann Coll. Works, III, 120-160). 




1937b S. W. P. Steen, An introduction to the theory of operators, II, Proc. London Math. Soc., 43 
(1937) 529-543. 

1938a S. W. P. Steen, An introduction to the theory of operators, III, Proc. London Math. Soc., 
44 (1938) 398-441. 

1938b Stanislaw Mazur, Sur les anneaux linéaires, C. R. Acad. Sci. Paris, 207 (1938) 1025-1027. 

1939 S. W. P. Steen, An introduction to the theory of operators, IV, Proc. Camb. Phil. Soc., 35 (1939)
562-578. 

1940a John von Neumann, On rings of operators, III, Ann. Math., 41 (1940) 94-161, (Coll. Works, 
III, 161-228). 

1940b S. W. P. Steen, An introduction to the theory of operators, V, Proc. Camb. Phil. Soc., 36 (1940) 
139-149. 

1940c Marshall H. Stone, A general theory of spectra I, Proc. Nat. Acad. Sci. U. S. A., 26 (1940) 
280-283. 

1941a Israel M. Gelfand, Normierte Ringe, Mat. Sbornik, N. S., 9 (1941) 3-24. 

1941b Israel M. Gelfand and Georgii E. Silov, Über verschiedene Methoden der Einführung der
Topologie in die Menge der Maximalen Ideale eines normierten Ringes, Mat. Sbornik, N. S., 
9 (1941) 25-38. 

1941c Marshall H. Stone, A general theory of spectra II, Proc. Nat. Acad. Sci. U. S. A., 27 (1941) 
83-87. 

1941d Shizuo Kakutani, Concrete representation of abstract (L) spaces and the mean ergodic
theorem, Ann. Math., 42 (1941) 523-537. 

1942 Edgar R. Lorch, The spectrum of linear transformations, Trans. Amer. Math. Soc., 52 (1942) 
238-248. 

1943 Israel M. Gelfand and M. A. Naimark, On the imbedding of normed rings into the ring of 
operators in Hilbert space, Mat. Sbornik, N. S., 12 (1943) 197-213. 

1944 Hermann Weyl, David Hilbert and his mathematical work, Bull. Amer. Math. Soc., 50 (1944) 
612-654. 

1946 Charles E. Rickart, Banach algebras with an adjoint operation, Ann. Math., 47 (1946) 528-550. 

1949a John von Neumann, On Rings of Operators, Reduction Theory, Ann. Math., 50 (1949) 401- 
485, (Coll. Works, III, 400-484). 

1949b Arne Beurling, On the spectral synthesis of bounded functions, Acta Mathematica, 81 (1949) 
225-238. 

1951 Paul R. Halmos, Introduction to Hilbert Space, Chelsea, New York, 1951. 

1953 Richard G. Cooke, Linear Operators, Macmillan, London, 1953. 

1958a Nelson Dunford and Jacob T. Schwartz, Linear Operators I: General Theory, Interscience, 
New York, 1958. 

1958b Paul R. Halmos, Finite Dimensional Vector Spaces, Second Edition, Van Nostrand, Princeton, 
N. J., 1958. 

1962 Edgar R. Lorch, Spectral Theory, Oxford U. P., New York, 1962. 

1963 Nelson Dunford and Jacob T. Schwartz, Linear Operators II: Spectral Theory, Interscience, 
New York, 1963. 

1966 Michael Bernkopf, The Development of Function Spaces with Particular Reference to their 
Origins in the Integral Equation Theory, Arch. Hist. Exact. Sci., 3 (1966) 1-96. 

1968 Michael Bernkopf, A History of Infinite Matrices, Arch. Hist. Exact Sci., 4 (1968) 308-358. 

1970 Constance Reid, Hilbert, Springer-Verlag, Berlin, 1970. 

1971 Nelson Dunford and Jacob T. Schwartz, Linear Operators III: Spectral Operators, Wiley, 
New York, 1971. 


THE LEGEND OF JOHN VON NEUMANN 


P. R. HALMOS, Indiana University 


John von Neumann was a brilliant mathematician who made important contribu- 
tions to quantum physics, to logic, to meteorology, to war, to the theory and applica- 
tions of high-speed computing machines, and, via the mathematical theory of games 
of strategy, to economics. 


Youth. He was born December 28, 1903, in Budapest, Hungary. He was the 
eldest of three sons in a well-to-do Jewish family. His father was a banker who 
received a minor title of nobility from the Emperor Franz Josef; since the title was 
hereditary, von Neumann’s full Hungarian name was Margittai Neumann János.
(Hungarians put the family name first. Literally, but in reverse order, the name means
John Neumann of Margitta. The “of”, indicated by the final “i”, is where the “von”
comes from; the place name was dropped in the German translation. In ordinary 
social intercourse such titles were never used, and by the end of the first world war 
their use had gone out of fashion altogether. In Hungary von Neumann is and always 
was known as Neumann János and his works are alphabetized under N. Incidentally,
his two brothers, when they settled in the U.S., solved the name problem differently. 
One of them reserves the title of nobility for ceremonial occasions only, but, in 
daily life, calls himself Neumann; the other makes it less conspicuous by amalgama- 
ting it with the family name and signs himself Vonneuman.) 

Even in the city and in the time that produced Szilard (1898), Wigner (1902), 
and Teller (1908), von Neumann’s brilliance stood out, and the legends about him 
started accumulating in his childhood. Many of the legends tell about his memory. 
His love of history began early, and, since he remembered what he learned, he 
ultimately became an expert on Byzantine history, the details of the trial of Joan of 
Arc, and minute features of the battles of the American Civil War. 


Paul Halmos claims that he took up mathematics because he flunked his master’s orals in 
philosophy. 

He received his Univ. of Illinois Ph.D. under J.L. Doob. Then he was von Neumann’s assistant, 
followed by positions at Illinois, Syracuse, M. I. T. ’s Radiation Lab, Chicago, Michigan, Hawaii, 
and now is Distinguished Professor at Indiana Univ. He spent leaves at the Univ. of Uruguay, 
Montevideo, Univ. of Miami, Univ. of California, Berkeley, Tulane, and Univ. of Washington. 
He held a Guggenheim Fellowship and was awarded the MAA Chauvenet Prize. 

Professor Halmos’ research is mainly measure theory, probability, ergodic theory, topological 
groups, Boolean algebra, algebraic logic, and operator theory in Hilbert space. He has served on 
the Council of the AMS for many years and was Editor of the Proceedings of the AMS and Mathe- 
matical Reviews. His eight books, all widely used, include Finite-Dimensional Vector Spaces (Van 
Nostrand, 1958), Measure Theory (Van Nostrand, 1950), Naive Set Theory (Van Nostrand, 1960), 
and Hilbert Space Problem Book (Van Nostrand, 1967). 

The present paper is the original uncut version of a brief article commissioned by the
Encyclopaedia Britannica. Editor.


THE LEGEND OF JOHN VON NEUMANN 383 


He could, it is said, memorize the names, addresses, and telephone numbers in 
a column of the telephone book on sight. Some of the later legends tell about his 
wit and his fondness for humor, including puns and off-color limericks. Speaking 
of the Manhattan telephone book he said once that he knew all the numbers in it — 
the only other thing he needed, to be able to dispense with the book altogether, was 
to know the names that the numbers belonged to. 

Most of the legends, from childhood on, tell about his phenomenal speed in 
absorbing ideas and solving problems. At the age of 6 he could divide two eight- 
digit numbers in his head; by 8 he had mastered the calculus; by 12 he had read and 
understood Borel’s Théorie des Fonctions. 

(These are some of the von Neumann stories in circulation. I’ll report others,
but I feel sure that I haven’t heard them all. Many are undocumented and unverifiable,
but I’ll not insert a separate caveat for each one: let this do for them all. Even the
purely fictional ones say something about him; the stories that men make up about a
folk hero are, at the very least, a strong hint to what he was like.)

In his early teens he had the guidance of an intelligent and dedicated high-school 
teacher, L. Ratz, and, not much later, he became a pupil of the young M. Fekete and 
the great L. Fejér, “the spiritual father of many Hungarian mathematicians”. (“Fekete”
means “Black”, and “Fejér” is an archaic spelling, analogous to “Whyte”.)


According to von Karman, von Neumann’s father asked him, when John von 
Neumann was 17, to dissuade the boy from becoming a mathematician, for financial 
reasons. As a compromise between father and son, the solution von Karman proposed
was chemistry. The compromise was adopted, and von Neumann studied chemistry
in Berlin (1921-1923) and in Zürich (1923-1925). In 1926 he got both a Zürich
diploma in chemical engineering and a Budapest Ph.D. in mathematics.


Early work. His definition of ordinal numbers (published when he was 20) is 
the one that is now universally adopted. His Ph.D. dissertation was about set theory 
too; his axiomatization has left a permanent mark on the subject. He kept up his 
interest in set theory and logic most of his life, even though he was shaken by K. 
Gödel’s proof of the impossibility of proving that mathematics is consistent.

He admired Gödel and praised him in strong terms: “Kurt Gödel’s achievement
in modern logic is singular and monumental — indeed it is more than a monument, 
it is a landmark which will remain visible far in space and time. ... The subject of 
logic has certainly completely changed its nature and possibilities with Gödel’s
achievement.” In a talk entitled “The Mathematician”, speaking, among other things,
of Gödel’s work, he said: “This happened in our lifetime, and I know myself
how humiliatingly easily my own values regarding the absolute mathematical truth 
changed during this episode, and how they changed three times in succession!” 

He was Privatdozent at Berlin (1926-1929) and at Hamburg (1929-1930). During 
this time he worked mainly on two subjects, far from set theory but near to one another: 
quantum physics and operator theory. It is almost not fair to call them two 


384 P. R. HALMOS [April 


subjects: due in great part to von Neumann’s own work, they can be viewed as two 
aspects of the same subject. He started the process of making precise mathematics 
out of quantum theory, and (it comes to the same thing really) he was inspired by 
the new physical concepts to make broader and deeper the purely mathematical 
study of infinite-dimensional spaces and operators on them. The basic insight was 
that the geometry of the vectors in a Hilbert space has the same formal properties as 
the structure of the states of a quantum-mechanical system. Once that is accepted, 
the difference between a quantum physicist and a mathematical operator-theorist 
becomes one of language and emphasis only. Von Neumann’s book on quantum 
mechanics appeared (in German) in 1932. It has been translated into French (1947), 
Spanish (1949), and English (1955), and it is still one of the standard and one 
of the most inspiring treatments of the subject. Speaking of von Neumann’s 
contributions to quantum mechanics, E. Wigner, a Nobel laureate, said that they 
alone “would have secured him a distinguished position in present day theoretical 
physics”. 


Princeton. In 1930 von Neumann went to Princeton University for one term as 
visiting lecturer, and the following year he became professor there. In 1933, when the 
Institute for Advanced Study was founded, he was one of the original six professors 
of its School of Mathematics, and he kept that position for the rest of his life. (It 
is easy to get confused about the Institute and its formal relation with Princeton 
University, even though there is none. They are completely distinct institutions. 
The Institute was founded for scholarship and research only, not teaching. The first 
six professors in the School of Mathematics were J. W. Alexander, A. Einstein, M. 
Morse, O. Veblen, J. von Neumann, and H. Weyl. When the Institute began it had 
no building, and it accepted the hospitality of Princeton University. Its members and 
visitors have, over the years, maintained close professional and personal relations 
with their colleagues at the University. These facts kept contributing to the confu- 
sion, which was partly clarified in 1940, when the Institute acquired a building of its 
own, about a mile from the Princeton campus.) 

In 1930 von Neumann married Marietta Kövesi; in 1935 their daughter Marina
was born. (In 1956 Marina von Neumann graduated from Radcliffe summa cum 
laude, with the highest scholastic record in her class. In 1972 Marina von Neumann 
Whitman was appointed by President Nixon to the Council of Economic Advisers.) 
In the 1930’s the stature of von Neumann, the mathematician, grew at the rate that 
his meteoric early rise had promised, and the legends about Johnny, the human 
being, grew along with it. He enjoyed life in America and lived it in an informal 
manner, very differently from the style of the conventional German professor. He 
was not a refugee and he didn’t feel like one. He was a cosmopolite in attitude and a 
U.S. citizen by choice. 

The parties at the von Neumanns’ house were frequent, and famous, and long. 
Johnny was not a heavy drinker, but he was far from a teetotaller. In a roadside
restaurant he once ordered a brandy with a hamburger chaser. The outing was in
honor of his birthday and he was feeling fine that evening. One of his gifts was a toy, 
a short prepared tape attached to a cardboard box that acted as sounding board; 
when the tape was pulled briskly past a thumbnail, it would squawk “Happy birthday!”
Johnny squawked it often. Another time, at a party at his house, there was one of 
those thermodynamic birds that dips his beak in a glass of water, straightens up, 
teeter-totters for a while, and then repeats the cycle. A temporary but firm house 
rule was quickly passed: everyone had to take a drink each time that the bird did. 

He liked to drive, but he didn’t do it well. There was a “von Neumann’s corner”
in Princeton, where, the story goes, his cars repeatedly had trouble. One often quoted
explanation that he allegedly offered for one particular crack-up goes like this:
“I was proceeding down the road. The trees on the right were passing me in orderly
fashion at 60 miles an hour. Suddenly one of them stepped in my path. Boom!” 

He once had a dog named “Inverse”. He played poker, but only rarely, and he 
usually lost. 

In 1937 the von Neumanns were divorced; in 1938 he married Klara Dan. She 
learned mathematics from him and became an expert programmer. Many years later, 
in an interview, she spoke about him. “He has a very weak idea of the geography of 
the house. ...Once, in Princeton, I sent him to get me a glass of water; he came back 
after a while wanting to know where the glasses were. We had been in the house 
only seventeen years. ...He has never touched a hammer or a screwdriver; he does 
nothing around the house. Except for fixing zippers. He can fix a broken zipper with 
a touch.” 

Von Neumann was definitely not the caricatured college professor. He was a 
round, pudgy man, always neatly, formally dressed. There are, to be sure, one or 
two stories of his absentmindedness. Klari told one about the time when he left their 
Princeton house one morning to drive to a New York appointment, and then phoned 
her when he reached New Brunswick to ask: “Why am I going to New York?”
It may not be strictly relevant, but I am reminded of the time I drove him to his 
house one afternoon. Since there was to be a party there later that night, and since I 
didn’t trust myself to remember exactly how I got there, I asked how I'd be able to 
know his house when I came again. “That’s easy,” he said; “it’s the one with that
pigeon sitting by the curb.” 

Normally he was alert, good at rapid repartee. He could be blunt, but never stuffy, 
never pompous. Once the telephone interrupted us when we were working in his 
office. His end of the conversation was very short; all he said between “Hello” 
and “Goodbye” was “Fekete pestis!”, which means “Black plague!” Remembering,
after he hung up, that I understood Hungarian, he turned to me, half apologetic and
half exasperated, and explained that he wasn’t speaking of one of the horsemen of the
Apocalypse, but merely of some unexpected and unwanted dinner guests that his 
wife just told him about. 

On a train once, hungry, he asked the conductor to send the man with the sandwich
tray to his seat. The busy and impatient conductor said “I will if I see him”. Johnny’s
reply: “This train is linear, isn’t it?” 


Speed. The speed with which von Neumann could think was awe-inspiring. G. 
Polya admitted that “Johnny was the only student I was ever afraid of. If in the 
course of a lecture I stated an unsolved problem, the chances were he’d come to me 
as soon as the lecture was over, with the complete solution in a few scribbles on a 
slip of paper.” Abstract proofs or numerical calculations — he was equally quick 
with both, but he was especially pleased with and proud of his facility with numbers. 
When his electronic computer was ready for its first preliminary test, someone suggest- 
ed a relatively simple problem involving powers of 2. (It was something of this kind: 
what is the smallest power of 2 with the property that its decimal digit fourth from 
the right is 7? This is a completely trivial problem for a present-day computer: it 
takes only a fraction of a second of machine time.) The machine and Johnny started 
at the same time, and Johnny finished first. 
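
A test problem of the kind just described is easy to reproduce today. What follows is only a minimal sketch in Python, not the program that ran on von Neumann's machine; the function name, the search bound, and the default choice of the digit 7 in the fourth place from the right (borrowed from the illustrative wording above) are assumptions made for this example.

```python
def smallest_power_of_two_with_digit(digit=7, place=4, max_exponent=10_000):
    """Return (n, 2**n) for the smallest n such that the decimal digit of 2**n
    in position `place` from the right (1 = units digit) equals `digit`.
    Returns None if no exponent up to `max_exponent` qualifies."""
    for n in range(max_exponent + 1):
        value = 2 ** n
        # Shift the wanted digit into the units position, then isolate it.
        if (value // 10 ** (place - 1)) % 10 == digit:
            return n, value
    return None


result = smallest_power_of_two_with_digit()
print(result)
```

For the sample question the search stops at the exponent 21, since 2 to the 21st power is 2,097,152 and no smaller power of 2 has a 7 in the thousands place.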

One famous story concerns a complicated expression that a young scientist at 
the Aberdeen Proving Ground needed to evaluate. He spent ten minutes on the 
first special case; the second computation took an hour of paper and pencil work; 
for the third he had to resort to a desk calculator, and even so took half a day. 
When Johnny came to town, the young man showed him the formula and asked him 
what to do. Johnny was glad to tackle it. “Let’s see what happens for the first few 
cases. If we put n = 1, we get...”, and he looked into space and mumbled for a
minute. Knowing the answer, the young questioner put in “2.31?” Johnny gave him a
funny look and said “Now if n = 2,...”, and once again voiced some of his thoughts
as he worked. The young man, prepared, could of course follow what Johnny was 
doing, and, a few seconds before Johnny finished, he interrupted again, in a hesitant 
tone of voice: “7.49?” This time Johnny frowned, and hurried on: “If n = 3, then...”.
The same thing happened as before — Johnny muttered for several minutes, the 
young man eavesdropped, and, just before Johnny finished, the young man exclaimed: 
“11.06!” That was too much for Johnny. It couldn’t be! No unknown beginner
could outdo him! He was upset and he sulked till the practical joker confessed. 


Then there is the famous fly puzzle. Two bicyclists start twenty miles apart and 
head toward each other, each going at a steady rate of 10 m.p.h. At the same time a 
fly that travels at a steady 15 m.p.h. starts from the front wheel of the southbound 
bicycle and flies to the front wheel of the northbound one, then turns around and flies 
to the front wheel of the southbound one again, and continues in this manner till he 
is crushed between the two front wheels. Question: what total distance did the fly 
cover? The slow way to find the answer is to calculate what distance the fly covers on the
first, northbound, leg of the trip, then on the second, southbound, leg, then on the 
third, etc., etc., and, finally, to sum the infinite series so obtained. The quick way is 
to observe that the bicycles meet exactly one hour after their start, so that the fly 
had just an hour for his travels; the answer must therefore be 15 miles. When the
question was put to von Neumann, he solved it in an instant, and thereby disappointed
the questioner: “Oh, you must have heard the trick before!” “What trick?”
asked von Neumann; “all I did was sum the infinite series.” 
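
Both routes to the answer can be checked numerically. The sketch below, in Python, is an illustration only; the function names and the cutoff of 60 legs are assumptions of the example. On each leg the fly and the oncoming bicycle close the gap at 15 + 10 = 25 m.p.h., and during that leg the two bicycles together shrink the gap to one fifth of its size, so the leg lengths form a geometric series: 12 + 12/5 + 12/25 + ... = 15.

```python
def fly_distance_by_series(gap=20.0, bike_speed=10.0, fly_speed=15.0, legs=60):
    """The slow way: add up the distance the fly covers on each leg."""
    total = 0.0
    for _ in range(legs):
        # Time for the fly to reach the bicycle that is coming toward it.
        t = gap / (fly_speed + bike_speed)
        total += fly_speed * t
        # Both bicycles kept moving during the leg, so the gap shrinks.
        gap -= 2 * bike_speed * t
    return total


def fly_distance_shortcut(gap=20.0, bike_speed=10.0, fly_speed=15.0):
    """The quick way: the fly flies for exactly as long as the bicycles ride."""
    time_until_meeting = gap / (2 * bike_speed)
    return fly_speed * time_until_meeting


print(fly_distance_by_series(), fly_distance_shortcut())
```

After 60 legs the partial sum of the series is indistinguishable, in floating-point arithmetic, from the 15 miles given by the shortcut.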

I remember one lecture in which von Neumann was talking about rings of operators.
At an appropriate point he mentioned that they can be classified two ways: finite
versus infinite, and discrete versus continuous. He went on to say: “This leads to a
total of four possibilities, and, indeed, all four of them can occur. Or — let’s see — 
can they?” Many of us in the audience had been learning this subject from him for
some time, and it was no trouble to stop and mentally check off all four possibilities. 
No trouble — it took something like two seconds for each, and, allowing for some 
fumbling and shifting of gears, it took us perhaps 10 seconds in all. But after two 
seconds von Neumann had already said “Yes, they can,” and he was two sentences
into the next paragraph before, dazed, we could scramble aboard again. 


Speech. Since Hungarian is not exactly a lingua franca, all educated Hungarians 
must acquire one or more languages with a popular appeal greater than that of their 
mother tongue. At home the von Neumanns spoke Hungarian, but he was perfectly 
at ease in German, and in French, and, of course, in English. His English was fast 
and grammatically defensible, but in both pronunciation and sentence construction 
it was reminiscent of German. His “Sprachgefühl” was not perfect, and his sentences
tended to become involved. His choice of words was usually exactly right; the occasional
oddities (like “a self-obvious theorem”) disappeared in later years. His spelling
was sometimes more consistent than commonplace: if “commit”, then “ommit”.
S. Ulam tells about von Neumann’s trip to Mexico, where “he tried to make himself
understood by using ‘neo-Castilian’, a creation of his own: English words with
an ‘el’ prefix and appropriate Spanish endings”.

He prepared for lectures, but rarely used notes. Once, five minutes before a non- 
mathematical lecture to a general audience, I saw him as he was preparing. He sat 
in the lounge of the Institute and scribbled on a small card a few phrases such as 
these: “Motivation, 5 min.; historical background, 15 min.; connection with economics,
10 min.;...”

As a mathematical lecturer he was dazzling. He spoke rapidly but clearly; he 
spoke precisely, and he covered the ground completely. If, for instance, a subject has 
four possible axiomatic approaches, most teachers content themselves with develop- 
ing one, or at most two, and merely mentioning the others. Von Neumann was fond 
of presenting the “complete graph” of the situation. He would, that is, describe the 
shortest path that leads from the first to the second, from the first to the third, and 
so on through all twelve possibilities.
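
The count of twelve comes from taking the four approaches in ordered pairs: each of the 4 starting points leads to each of the 3 others, and 4 × 3 = 12. A small Python sketch makes the enumeration explicit; the labels A, B, C, D are hypothetical stand-ins for the four axiomatic approaches.

```python
from itertools import permutations

# Hypothetical labels for the four axiomatic approaches.
approaches = ["A", "B", "C", "D"]

# Every ordered pair (start, finish) with start != finish is one directed path.
paths = list(permutations(approaches, 2))

for start, finish in paths:
    print(f"from {start} to {finish}")
print(len(paths))
```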

His one irritating lecturing habit was the way he wielded an eraser. He would 
write on the board the crucial formula under discussion. When one of the symbols 
in it had been proved to be replaceable by something else, he made the replacement 
not by rewriting the whole formula, suitably modified, but by erasing the replaceable
symbol and substituting the new one for it. This had the tendency of inducing
symptoms of acute discouragement among note-takers, especially since, to maintain 
the flow of the argument, he would keep talking at the same time. 


His style was so persuasive that one didn’t have to be an expert to enjoy his 
lectures; everything seemed easy and natural. Afterward, however, the Chinese- 
dinner phenomenon was likely to occur. A couple of hours later the average memory 
could no longer support the delicate balance of mutually interlocking implications, 
and, puzzled, would feel hungry for more explanation. 


Style. As a writer of mathematics von Neumann was clear, but not clean; he 
was powerful but not elegant. He seemed to love fussy detail, needless repetition, 
and notation so explicit as to be confusing. To maintain a logically valid but perfectly 
transparent and unimportant distinction, in one paper he introduced an extension of 
the usual functional notation: along with the standard $\phi(x)$ he dealt also with
something denoted by $\phi((x))$. The hair that was split to get there had to be split
again a little later, and there was $\phi(((x)))$, and, ultimately, $\phi((((x))))$. Equations
such as


$$(\psi((((a)))))^2 = \phi((((a))))$$


have to be peeled before they can be digested; some irreverent students referred to 
this paper as von Neumann’s onion. 


Perhaps one reason for von Neumann’s attention to detail was that he found it 
quicker to hack through the underbrush himself than to trace references and see 
what others had done. The result was that sometimes he appeared ignorant of the 
standard literature. If he needed facts, well-known facts, from Lebesgue integration 
theory, he waded in, defined the basic notions, and developed the theory to the 
point where he could use it. If, in a later paper, he needed integration theory again, 
he would go back to the beginning and do the same thing again. 


He saw nothing wrong with long strings of suffixes, and subscripts on subscripts; 
his papers abound in avoidable algebraic computations. The reason, probably, 
is that he saw the large picture; the trees did not conceal the forest from him. He 
saw and he relished all parts of the mathematics he was thinking about. He never 
wrote “down” to an audience; he told it as he saw it. The practice caused no harm; 
the main result was that, quite a few times, it gave lesser men an opportunity to 
publish “improvements” of von Neumann. 


Since he had no formal connections with educational institutions after he was 30, 
von Neumann does not have a long list of students; he supervised only one Ph.D. 
thesis. Through his lectures and informal conversations he acquired, however, 
quite a few disciples who followed in one or another of his footsteps. A few among 
them are J. W. Calkin, J. Charney, H. H. Goldstine, P. R. Halmos, I. Halperin, O. 
Morgenstern, F. J. Murray, R. Schatten, I. E. Segal, A. H. Taub, and S. Ulam. 


Work habits. Von Neumann was not satisfied with seeing things quickly and 
clearly; he also worked very hard. His wife said “he had always done his writing at
home during the night or at dawn. His capacity for work was practically unlimited.” 
In addition to his work at home, he worked hard at his office. He arrived early, 
he stayed late, and he never wasted any time. He was systematic in both large 
things and small; he was, for instance, a meticulous proofreader. He would correct a 
manuscript, record on the first page the page numbers where he found errors, and, 
by appropriate tallies, record the number of errors that he had marked on each of 
those pages. Another example: when requested to prepare an abstract of not more than 
200 words, he would not be satisfied with a statistical check — there are roughly 20 
lines with about 10 words each — but he would count every word. 

When I was his assistant we wrote one paper jointly. After the thinking and 
the talking were finished, it became my job to do the writing. I did it, and I submitted 
to him a typescript of about 12 pages. He read it, criticized it mercilessly, crossed out 
half, and rewrote the rest; the result was about 18 pages. I removed some of the 
Germanisms, changed a few spellings, and compressed it into 16 pages. He was far 
from satisfied, and made basic changes again; the result was 20 pages. The almost 
divergent process continued (four innings on each side as I now recall it); the final 
outcome was about 30 typescript pages (which came to 19 in print). 

Another notable and enviable trait of von Neumann’s was his mathematical 
courage. If, in the middle of a search for a counterexample, an infinite series came up, 
with a lot of exponentials that had quadratic exponents, many mathematicians 
would start with a clean sheet of paper and look for another counterexample. Not 
Johnny! When that happened to him, he cheerfully said: “Oh, yes, a theta function...”,
and plowed ahead with the mountainous computations. He wasn’t afraid of anything. 

He knew a lot of mathematics, but there were also gaps in his knowledge, most 
notably number theory and algebraic topology. Once when he saw some of us at a
blackboard staring at a rectangle that had arrows marked on each of its sides, he
wanted to know what that was. “Oh just the torus, you know, the usual identification
convention.” No, he didn’t know. The subject is elementary, but some of it just 
never crossed his path, and even though most graduate students knew about it, he 
didn’t. 

Brains, speed, and hard work produced results. In von Neumann’s Collected 
Works there is a list of over 150 papers. About 60 of them are on pure mathematics 
(set theory, logic, topological groups, measure theory, ergodic theory, operator
theory, and continuous geometry), about 20 on physics, about 60 on applied mathema- 
tics (including statistics, game theory, and computer theory), and a small handful 
on some special mathematical subjects and general non-mathematical ones. A special 
number of the Bulletin of the American Mathematical Society was devoted to a discus- 
sion of his life and work (in May 1958). 


Pure mathematics. Von Neumann’s reputation as a mathematician was firmly
established by the 1930’s, based mainly on his work on set theory, quantum theory,
and operator theory, but enough more for about three ordinary careers, in pure mathe- 
matics alone, was still to come. The first of these was the proof of the ergodic theorem. 
Various more or less precise statements had been formulated earlier in statistical 
mechanics and called the ergodic hypothesis. In 1931 B. O. Koopman published a 
penetrating remark whose main substance was that one of the contexts in which 
a precise statement of the ergodic hypothesis could be formulated is the theory of 
operators on Hilbert space — the very subject that von Neumann used earlier to make 
quantum mechanics precise and on which he had written several epoch-making 
papers. It is tempting to speculate on von Neumann’s reaction to Koopman’s paper. 
It could have been something like this: “By Koopman’s remark the ergodic 
hypothesis becomes a theorem about Hilbert spaces — and if that’s what it is I ought 
to be able to prove it. Let’s see now....’’ Soon after the appearance of Koopman’s 
paper, von Neumann formulated and proved the statement that is now known as the 
mean ergodic theorem for unitary operators. There was some temporary confusion, 
caused by publication dates, about who did what before whom, but by now it is 
universally recognized that von Neumann’s theorem preceded and inspired G. D. 
Birkhoff’s point ergodic theorem. In the course of the next few years von Neumann 
published several more first-rate papers on ergodic theory, and he made use 
of the techniques and results of that theory later, in his studies of rings of operators. 
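
For reference, the mean ergodic theorem for unitary operators is usually stated as follows in present-day notation. The symbols $U$, $H$, $P$, and $f$ are labels chosen here for the statement, not notation taken from the text above.

```latex
Let $U$ be a unitary operator on a Hilbert space $H$, and let $P$ be the
orthogonal projection onto the subspace $\{ f \in H : Uf = f \}$ of vectors
left fixed by $U$. Then, for every $f \in H$,
\[
  \lim_{n \to \infty}
  \Bigl\| \frac{1}{n} \sum_{k=0}^{n-1} U^{k} f \;-\; P f \Bigr\| = 0 .
\]
```

In words: the averages of the iterates of $f$ converge in the norm of $H$ (that is, "in the mean") to the part of $f$ that $U$ leaves invariant.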

In 1900 D. Hilbert presented a famous list of 23 problems that summarized the 
state of mathematical knowledge at the time and showed where further work was
needed. In 1933 A. Haar proved the existence of a suitable measure (which has come 
to be called Haar measure) in topological groups; his proof appears in the Annals of 
Mathematics. Von Neumann had access to Haar’s result before it was published, and 
he quickly saw that that was exactly what was needed to solve an important special 
case (compact groups) of one of Hilbert’s problems (the 5th). His solution appears 
in the same issue of the same journal, immediately after Haar’s paper. 

In the second half of the 1930’s the main part of von Neumann’s publications was 
a sequence of papers, partly in collaboration with F. J. Murray, on what he called 
rings of operators. (They are now called von Neumann algebras.) It is possible 
that this is the work for which von Neumann will be remembered the longest. It 
is a technically brilliant development of operator theory that makes contact with von 
Neumann’s earlier work, generalizes many familiar facts about finite-dimensional 
algebra, and is currently one of the most powerful tools in the study of quantum 
physics. 

A surprising outgrowth of the theory of rings of operators is what von Neumann 
called continuous geometry. Ordinary geometry deals with spaces of dimension 
1, 2, 3, etc. In his work on rings of operators von Neumann saw that what really 
determines the dimension structure of a space is the group of rotations that it admits. 
The group of rotations associated with the ring of all operators yields the familiar 
dimensions. Other groups, associated with different rings, assign to spaces dimensions
whose values can vary continuously; in that context it makes sense to speak of a space
of dimension 3/4, say. Abstracting from the “concrete” case of rings of operators, 
von Neumann formulated the axioms that make these continuous-dimensional 
spaces possible. For several years he thought, wrote, and lectured about continuous
geometries. In 1937 he was the Colloquium Lecturer of the American Mathematical 
Society and chose that subject for his topic. 


Applied mathematics. The year 1940 was just about the half-way point of von 
Neumann’s scientific life, and his publications show a discontinuous break then. 
Till then he was a topflight pure mathematician who understood physics; after 
that he was an applied mathematician who remembered his pure work. He be- 
came interested in partial differential equations, the principal classical tool of the 
applications of mathematics to the physical world. Whether the war made him into an 
applied mathematician or his interest in applied mathematics made him invaluable 
to the war effort, in either case he was much in demand as a consultant and advisor to 
the armed forces and to the civilian agencies concerned with the problems of war. 
His papers from this point on are mainly on statistics, shock waves, flow problems, 
hydrodynamics, aerodynamics, ballistics, problems of detonation, meteorology, and, 
last but not least, two non-classical, new aspects of the applicability of mathematics 
to the real world: games and computers. 

Von Neumann’s contributions to war were manifold. Most often mentioned is his 
proposal of the implosion method for bringing nuclear fuel to explosion (during 
World War II) and his espousal of the development of the hydrogen bomb (after
the war). The citation that accompanied his honorary D.Sc. from Princeton in 
1947 mentions (in one word) that he was a mathematician, but praises him for 
being a physicist, an engineer, an armorer, and a patriot. 


Politics. His political and administrative decisions were rarely on the side that is 
described nowadays by the catchall term “liberal”. He appeared at times to advocate
preventive war with Russia. As early as 1946 atomic bomb tests were already receiv- 
ing adverse criticism, but von Neumann thought that they were necessary and (in, 
for instance, a letter to the New York Times) defended them vigorously. He disagreed 
with J. R. Oppenheimer on the H-bomb crash program, and urged that the U.S. 
proceed with it before Russia could. He was, however, a “pro-Oppenheimer” witness
at the Oppenheimer security hearings. He said that Oppenheimer opposed the program
“in good faith” and was “very constructive” once the decision to go ahead with the
super bomb was made. He insisted that Oppenheimer was loyal and was not a 
security risk. 

As a member of the Atomic Energy Commission (appointed by President 
Eisenhower, he was sworn in on March 15, 1955), having to “think about the unthinkable”,
he urged a United Nations study of world-wide radiation effects. “We willingly
pay 30,000-40,000 fatalities per year (2% of the total death rate),” he wrote, “for 
the advantages of individual transportation by automobile.” He mentioned a
fall-out accident in an early Pacific bomb test that resulted in one fatality and danger
to 200 people, and he compared it with a Japanese ferry accident that “killed about 
1,000 people, including 20 Americans — yet the...fall-out was what attracted almost 
world-wide attention.” He asked: “Is the price in international popularity worth 
paying?” And he answered: “Yes: we have to accept it as part payment for our more
advanced industrial position.” 


Game theory. At about the same time that he began to apply his analytic talents to 
the problems of war, von Neumann found time and energy to apply his combinatorial 
insight to what he called the theory of games, whose major application was to econo- 
mics. The mathematical cornerstone of the theory is one statement, the so-called 
minimax theorem, that von Neumann proved early (1928) in a short article (25 pages); 
its elaboration and applications are in the book he wrote jointly with O. Morgenstern 
in 1944. The minimax theorem says about a large class of two-person games that 
there is no point in playing them. If either player considers, for each possible strategy 
of play, the maximum loss that he can expect to sustain with that strategy, and then 
chooses the “optimal” strategy that minimizes the maximum loss, then he can be
statistically sure of not losing more than that minimax value. Since (and this is the 
whole point of the theorem) that value is the negative of the one, similarly defined, 
that his opponent can guarantee for himself, the long-run outcome is completely 
determined by the rules. 
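
The minimax idea can be illustrated on a tiny game. The sketch below is only an illustration under stated assumptions: the game (matching pennies), the helper name `security_level`, and the coarse grid search over mixed strategies are all chosen here for the example and are not taken from von Neumann's paper.

```python
# Hypothetical illustration: matching pennies as a 2x2 zero-sum game.
# payoff[i][j] is what the row player wins when row i meets column j.
matching_pennies = [[1, -1], [-1, 1]]


def security_level(payoff, p):
    """Worst-case expected payoff for a row player who plays row 0 with probability p."""
    columns = range(len(payoff[0]))
    return min(p * payoff[0][j] + (1 - p) * payoff[1][j] for j in columns)


# Search a grid of mixed strategies for the one with the best worst case.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda p: security_level(matching_pennies, p))
value = security_level(matching_pennies, best_p)
```

In this particular game the best guarantee is reached by mixing the two rows equally, and the guaranteed value is zero; by symmetry the column player can guarantee the same, which is the sense in which the long-run outcome is fixed by the rules.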

Mathematical economics before von Neumann tried to achieve success by imita- 
ting the technique of classical mathematical physics. The mathematical tools used were 
those of analysis (specifically the calculus of variations), and the procedure relied on a 
not completely reliable analogy between economics and mechanics. The secret of the 
success of the von Neumann approach was the abandonment of the mechanical 
analogy and its replacement by a fresh point of view (games of strategy) and new 
tools (the ideas of combinatorics and convexity). 

The role that game theory will play in the future of mathematics and economics 
is not easy to predict. As far as mathematics is concerned, it is tenable that the only 
thing that makes the Morgenstern-von Neumann book 600 pages longer than the 
original von Neumann paper is the development needed to apply the abstruse deduc-
tions of one subject to the concrete details of another. On the other hand, enthusiastic
proponents of game theory can be found who go so far as to say that it may be “one of
the major scientific contributions of the first half of the 20th century”.


Machines. The last subject that contributed to von Neumann’s fame was the 
theory of electronic computers and automata. He was interested in them from every 
point of view: he wanted to understand them, design them, build them, and use them. 
What are the logical components of the processes that a computer will be asked to 
perform? What is the best way of obtaining practically reliable answers from a machine 
with unreliable components? What does a machine need to “remember”, and what
is the best way to equip it with a “memory”? Can a machine be built that can not
only save us the labor of computing but save us also the trouble of building a new
machine — is it possible, in other words, to produce a self-reproducing automaton? 
(Answer: in principle, yes. A sufficiently complicated machine, embedded in a thick 
chowder of randomly distributed spare parts, its “food”, would pick up one part after
another till it found a usable one, put it in place, and continue to search and construct 
till its descendant was complete and operational.) Can a machine successfully imitate
“randomness”, so that when no formulae are available to solve a concrete
physical problem (such as that of finding an optimal bombing pattern), the 
machine can perform a large number of probability experiments and yield an answer 
that is statistically accurate? (The last question belongs to the concept that is some- 
times described as the Monte Carlo method.) These are some of the problems that 
von Neumann studied and to whose solutions he made basic contributions. 


He had close contact with several computers — among them the MANIAC 
(Mathematical Analyzer, Numerical Integrator, Automatic Calculator), and the 
affectionately named JONIAC. He advocated their use for everything from the 
accumulation of heuristic data for the clarification of our intuition about partial 
differential equations to the accurate long-range prediction and, ultimately, control 
of the weather. One of the most striking ideas whose study he suggested was to dye 
the polar icecaps so as to decrease the amount of energy they would reflect — the 
result could warm the earth enough to make the climate of Iceland approximate 
that of Hawaii. 

The last academic assignment that von Neumann accepted was to deliver and 
prepare for publication the Silliman lectures at Yale. He worked on that job in the 
hospital where he died, but he couldn’t finish it. His notes for it were published, and 
even they make illuminating reading. They contain tantalizing capsule statements of 
insights, and throughout them there shines an attitude of faith in and dedication 
to knowledge. While physicists, engineers, meteorologists, statisticians, logicians, 
and computers all proudly claim von Neumann as one of theirs, the Silliman lectures 
prove, indirectly by their approach and explicitly in the author’s words, that von 
Neumann was first, foremost, and always a mathematician. 


Death. Von Neumann was an outstanding man in tune with his times, and it 
is not surprising that he received many awards and honors. There is no point in 
listing them all here, but a few may be mentioned. He received several honorary 
doctorates, including ones from Princeton (1947), Harvard (1950), and Istanbul 
(1952). He served a term as president of the American Mathematical Society (1951- 
1953), and he was a member of several national scientific academies (including, 
of course, that of the U. S.). Somewhat to his embarrassment, he was elected to the 
East German Academy of Science, but the election didn’t seem to take — in later 
years no mention is made of it in the standard biographical reference works. 
He received the Enrico Fermi award in 1956, when he already knew that he was
incurably ill.

Von Neumann became ill in 1955. There was an operation, and the result was a 
diagnosis of cancer. He kept on working, and even travelling, as the disease progressed. 
Later he was confined to a wheelchair, but still thought, wrote, and attended meetings. 
In April 1956 he entered Walter Reed Hospital, and never left it. Of his last days his 
good friend Eugene Wigner wrote: “When von Neumann realized he was incurably 
ill, his logic forced him to realize that he would cease to exist, and hence cease to have 
thoughts. ...It was heartbreaking to watch the frustration of his mind, when all 
hope was gone, in its struggle with the fate which appeared to him unavoidable but 
unacceptable.” 

Von Neumann was baptized a Roman Catholic (in the U. S.), but, after his 
divorce, he was not a practicing member of the church. In the hospital he asked to see 
a priest, “one that will be intellectually compatible”. Arrangements were made,
he was given special instruction, and, in due course, he again received the sacraments. 
He died February 8, 1957. 

The heroes of humanity are of two kinds: the ones who are just like all of us, but 
very much more so, and the ones who, apparently, have an extra-human spark. We 
can all run, and some of us can run the mile in less than 4 minutes; but there is noth-
ing that most of us can do that compares with the creation of the Great G-minor 
Fugue. Von Neumann’s greatness was the human kind. We can all think clearly, 
more or less, some of the time, but von Neumann’s clarity of thought was orders of 
magnitude greater than that of most of us, all the time. Both Norbert Wiener and 
John von Neumann were great men, and their names will live after them, but for 
different reasons. Wiener saw things deeply but intuitively; von Neumann saw 
things clearly and logically. 

What made von Neumann great? Was it the extraordinary rapidity with which 
he could understand and think and the unusual memory that retained everything 
he had once thought through? No. These qualities, however impressive they might 
have been, are ephemeral; they will have no more effect on the mathematics and the 
mathematicians of the future than the prowess of an athlete of a hundred years ago 
has on the sport of today. 

The “axiomatic method” is sometimes mentioned as the secret of von Neumann’s
success. In his hands it was not pedantry but perception; he got to the root of the 
matter by concentrating on the basic properties (axioms) from which all else follows. 
The method, at the same time, revealed to him the steps to follow to get from the 
foundations to the applications. He knew his own strengths and he admired, per- 
haps envied, people who had the complementary qualities, the flashes of irrational 
intuition that sometimes change the direction of scientific progress. For von Neumann 
it seemed to be impossible to be unclear in thought or in expression. His insights 
were illuminating and his statements were precise. 


ALTERNATING EULER PATHS FOR PACKINGS AND COVERS 
C. T. ZAHN, Jr., Stanford University 


1. Introduction. An interesting combinatorial problem known as the “school- 
girls’ walk” asks if the girls in an all-girl school can take a walk in two-by-two 
fashion so that each pair walking side by side are on friendly terms, it being known 
which pairs are friendly among all possible pairings. If such a utopian arrangement 
is not possible, then what is the largest number of friendly pairings that can be 
achieved simultaneously and how can such an optimal set of pairings be found? 
This problem is abstractly equivalent to a problem in graph theory which is as 
follows: Let G be a finite graph with vertex set V and edge set E; a matching M of 
graph G is a subset of E such that no two edges in M have a vertex in common. A
matching M* is maximum if no other matching has more edges than M*. A matching 
P is perfect if each vertex in V belongs to an edge of P. In this abstract version the 
vertices represent girls and edges represent pairs of friendly girls. A matching is a 
pairing of friendly girls with no girl appearing twice, an obviously necessary require- 
ment. A perfect matching represents the utopian arrangement and a maximum 
matching achieves the largest number of friendly pairings. 
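
The definitions above can be made concrete with a short sketch. It assumes edges are given as pairs of vertex names; the function names are hypothetical, and the exhaustive search is only meant for very small graphs (it is not an efficient algorithm).

```python
from itertools import combinations


def is_matching(edges):
    """True if no two edges in the collection share a vertex."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def is_perfect(vertices, edges):
    """True if the edges form a matching that touches every vertex."""
    covered = {x for e in edges for x in e}
    return is_matching(edges) and covered == set(vertices)


def maximum_matching_brute_force(edges):
    """Try subsets from largest to smallest; the first matching found is maximum."""
    for k in range(len(edges), -1, -1):
        for subset in combinations(edges, k):
            if is_matching(subset):
                return list(subset)
    return []
```

For four girls a, b, c, d where only the pairs (a, b), (b, c), (c, d) are friendly, the search returns the two pairs (a, b) and (c, d), which is also a perfect matching.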

A related problem concerns a minimum cover for a graph G where a cover C is a 
subset of E such that every vertex belongs to at least one edge in C. A perfect 
matching is then a matching which is also a cover and it is easy to see that any subset 
of edges which is both a matching and a cover is necessarily a maximum matching 
and a minimum cover. 

Charles Zahn studied at Princeton Univ., the Catholic Univ. of America, and the Univ. of
Wisconsin. He has worked with the Applied Math Div. of NBS, the General Electric Co. (Computer
Department), and presently with the Computation Group of the Stanford Linear Accelerator
Center. He is a member of the MAA and the ACM, and his main research interest is picture
processing and pattern recognition. Editor.

A good algorithm for finding a maximum matching requires a reasonably simple
condition which, when true, assures that a matching M is maximum and, when false,
implies M is not maximum, and further indicates how to modify M to obtain a
larger matching M’. Such a condition is afforded by augmenting paths. A path in G
is a sequence of edges such that two edges adjacent in the sequence share a vertex
in G. For our purposes, no edge can appear more than once in a path. If S is a subset
of edges in G, an S-alternating path is a path whose edges are alternately in S and in
$\bar S = E - S$. A vertex v is S-exposed if v belongs to no edge in S. An M-augmenting
path for matching M is an M-alternating path whose end vertices are M-exposed.
Notice this implies the end edges are in $\bar M$ and the path has odd length. Such a path
is called augmenting because by interchanging the M and $\bar M$ status of the edges of
the path, a new matching M’ results with $|M'| = |M| + 1$. Hence, the existence
of an M-augmenting path implies M is not maximum. What is not so obvious is
that the non-existence of an M-augmenting path implies M is maximum. This
result, first obtained by Berge [1], means that the non-existence of an augmenting
path is a condition which can be used to find a maximum matching. Interest in an
efficient algorithm stems from several interesting practical problems which can be
formulated as optimum matching or cover problems (see [2], [3], [4, p. 177] for
details).
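
The augmentation step described above can be sketched as follows. The representation is an assumption made for this example: a matching is a set of `frozenset` edges and a path is given as its vertex sequence; the helper names are hypothetical.

```python
def edge(u, v):
    """An undirected edge as a hashable two-element set."""
    return frozenset((u, v))


def is_augmenting(path, matching):
    """Check that a vertex sequence is an augmenting path for the matching."""
    edges = [edge(u, v) for u, v in zip(path, path[1:])]
    if not edges or len(set(edges)) != len(edges):
        return False                      # empty path or repeated edge
    matched = {x for e in matching for x in e}
    if path[0] == path[-1] or path[0] in matched or path[-1] in matched:
        return False                      # end vertices must be distinct and exposed
    # Edges must alternate: outside, inside, ..., outside the matching.
    alternates = all((e in matching) == (i % 2 == 1) for i, e in enumerate(edges))
    return alternates and len(edges) % 2 == 1


def augment(path, matching):
    """Swap the status of the path edges; the result has one more edge."""
    path_edges = {edge(u, v) for u, v in zip(path, path[1:])}
    return matching ^ path_edges
```

On the path a, b, c, d with the single matched edge (b, c), the whole path is augmenting, and swapping edge status yields the matching {(a, b), (c, d)}.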

An analogous situation holds for minimum covers. A vertex v is S-doubled
for a subset S of edges if v belongs to at least two edges of S. A C-reducing path for
a cover C is a C-alternating path whose end edges are in C and whose end vertices
are C-doubled. Once again a C-reducing path leads to a new cover C’ with
$|C'| = |C| - 1$. Furthermore, Norman and Rabin [2] have shown that the non-existence
of a C-reducing path implies C is minimum, leading to an algorithm for finding a
minimum cover. They also show that a maximum matching M* can be obtained
from a minimum cover C* by deleting all but one C*-edge from each C*-doubled
vertex. Adding an edge to M* to cover each M*-exposed vertex of a maximum
matching M* produces a minimum cover C*. Edmonds [3] has generalized the
theorems of Berge [1] and Norman-Rabin [2] by replacing edges (2-element
subsets) with more general subsets of vertices whereupon the appropriate
improvement structures become trees in a certain graph.
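
The two conversions between matchings and covers mentioned above can be sketched as follows. This is a minimal illustration, assuming edges are `frozenset` pairs and that the graph has no isolated vertices; the function names are hypothetical, and the code makes no attempt to verify that its input is actually maximum or minimum.

```python
def cover_from_matching(vertices, edges, matching):
    """Add one incident edge for every vertex the matching leaves exposed."""
    cover = set(matching)
    covered = {x for e in matching for x in e}
    for v in vertices:
        if v not in covered:
            e = next((e for e in edges if v in e), None)
            if e is None:
                raise ValueError("isolated vertex: no cover exists")
            cover.add(e)
            covered.update(e)
    return cover


def matching_from_cover(vertices, cover):
    """At every doubled vertex keep one cover edge and delete the others."""
    matching = set(cover)
    for v in vertices:
        at_v = [e for e in matching if v in e]
        for e in at_v[1:]:
            matching.discard(e)
    return matching
```

For a star with center c and leaves x, y, z, a maximum matching has one edge and the derived cover has three; going back from that cover gives a single-edge matching again.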


2. Packings and covers. Here we are concerned with a different generalization
of the matching and cover problems. Let $\delta$ be a function which assigns a
nonnegative integer to each vertex v of G. If d(v), the local degree of v, is the number
of edges to which v belongs and $\delta(v) \le d(v)$ for all v in V then $\delta$ is called a local
degree constraint on G. A $\delta$-packing in G is a subset P of edges such that each vertex
v in V belongs to, at most, $\delta(v)$ edges in P. A $\delta$-cover C is a subset of edges, such that
each vertex v belongs to at least $\delta(v)$ edges in C. In this terminology, a matching
is a 1-packing (i.e., a $\delta$-packing with $\delta = 1$) and a cover is a 1-cover. Optimum
$\delta$-packings and $\delta$-covers are defined in the obvious way.
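
The two definitions translate directly into degree checks. The sketch below assumes the constraint is supplied as a dictionary from vertices to nonnegative integers (called `delta` here, a name chosen for the example) and that edges are pairs of vertices.

```python
from collections import Counter


def local_degrees(edges):
    """Number of edges of the given collection that contain each vertex."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def is_packing(subset, delta):
    """Every vertex lies on at most delta[v] edges of the subset."""
    degree = local_degrees(subset)
    return all(degree[v] <= delta[v] for v in delta)


def is_cover(subset, delta):
    """Every vertex lies on at least delta[v] edges of the subset."""
    degree = local_degrees(subset)
    return all(degree[v] >= delta[v] for v in delta)
```

With the constant constraint 1 on every vertex, `is_packing` reduces to the matching test and `is_cover` to the ordinary cover test.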

There is a strong duality between $\delta$-packings and $\delta$-covers which does not
exist between matchings and covers. If $\delta$ is a local degree constraint on G, then
$\bar\delta = d - \delta$ is also a local degree constraint on G and we say $\delta$ and $\bar\delta$ are
complementary. It is not hard to see that a subset S of edges of G is a $\delta$-packing if and only if
$\bar S$ is a $\bar\delta$-cover. Let v be a vertex and S be a subset of edges; then v is $(S, \delta)$-deficient
if v belongs to fewer than $\delta(v)$ edges in S and is $(S, \delta)$-surfeited if v belongs to
more than $\delta(v)$ edges in S. A $(P, \delta)$-augmenting path for $\delta$-packing P is a
P-alternating path whose end edges are in $\bar P$ and whose end vertices are $(P, \delta)$-deficient;
a $(C, \delta)$-reducing path for $\delta$-cover C is a C-alternating path whose end edges are
in C and whose end vertices are $(C, \delta)$-surfeited. In case the end vertices of an
augmenting (reducing) path are identical, the deficiency (surfeit) is required to be at
least two. Now the duality means that path $\pi$ in G is $(P, \delta)$-augmenting for $\delta$-packing
P if and only if $\pi$ is $(\bar P, \bar\delta)$-reducing for $\bar\delta$-cover $\bar P$. It is also clear that P is a
maximum $\delta$-packing if and only if $C = \bar P$ is a minimum $\bar\delta$-cover.
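
The packing/cover duality rests on one counting identity, written out here as a short worked step. The notation $d(v,S)$ for the number of edges of $S$ containing $v$ is borrowed from the later sections of the paper; $\bar S$ denotes the complement $E - S$ and $\bar\delta = d - \delta$.

```latex
\begin{align*}
d(v,S) + d(v,\bar S) &= d(v) && \text{(every edge at $v$ lies in exactly one of $S$, $\bar S$)}\\
d(v,S) \le \delta(v)
  &\iff d(v) - d(v,\bar S) \le \delta(v)\\
  &\iff d(v,\bar S) \ge d(v) - \delta(v) = \bar\delta(v).
\end{align*}
```

Since this holds at every vertex, $S$ satisfies the packing inequalities for $\delta$ exactly when $\bar S$ satisfies the cover inequalities for $\bar\delta$.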

The theorem of Berge-Norman-Rabin proved by Berge [4, p. 175] asserts that
the non-existence of a $(P, \delta)$-augmenting path for $\delta$-packing P implies P is
maximum. Using duality, we see immediately that non-existence of a $(C, \delta)$-reducing
path for $\delta$-cover C ensures C is minimum. Goldman [5] has proved the B-N-R
theorem on augmentable $\delta$-packings by a direct reduction to the theorem of Berge
[1] on augmentable matchings (1-packings). The main result of this paper is a
simple direct proof of the B-N-R theorem using ideas from Edmonds’ simple
proof [6] of the original augmenting path theorem of Berge [1] and the notion
of Euler path [7] which dates back to 1736.

As regards design of algorithms, Edmonds [6] has found an exceptionally effi- 
cient algorithm for determining a maximum matching by growing alternating-path 
trees and occasionally shrinking odd-length cyclic paths until an augmenting path 
is discovered or the edges of the graph have been depleted. Witzgall and Zahn [8] 
have devised a modified version of the Edmonds algorithm which does not shrink 
and Edmonds [9] has extended his algorithm to the case where edges have real- 
valued weights and maximum is defined accordingly. 

The reader is referred to Berge [4] and Ore [10] for more leisurely discussions
of matchings and coverings in graphs. Alternating paths were invented by Petersen
[11] in the last century and augmenting paths for $\delta$-packings occur in Tutte’s [12]
paper on f-factors (perfect f-packings) in a graph.


3. Euler paths. One of the earliest problems in graph theory was posed and
solved by Euler [7] in the year 1736. The problem is known as the “Königsberg
bridge problem” and intrigued the inhabitants of this Prussian town until solved
by Euler. We quote Euler’s [7] statement of the problem:

    In the town of Königsberg in Prussia there is an island A, called “Kneiphof”,
    with the two branches of the river (Pregel) flowing around it, as shown in
    Figure 1. There are seven bridges, a, b, c, d, e, f and g, crossing the two branches.
    The question is whether a person can plan a walk in such a way that he will
    cross each of these bridges once but not more than once.

Euler recognized the combinatorial nature of the problem and his solution
can be phrased in terms of the graph in Figure 1. An Euler path in a graph G is a
path containing each edge of G exactly once. Euler showed that such a path is
possible if and only if no more than 2 vertices of G have odd local degree and G
is connected (an obviously necessary condition). Hence, the answer to the
Königsberg bridge problem is negative, there being 4 odd vertices. If the graph has exactly
two odd vertices a and b, then an Euler path must have a and b as end vertices. If no
vertices are odd, the graph contains a closed Euler path.
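
Euler's criterion is easy to check mechanically. The sketch below encodes the bridge graph as a list of seven vertex pairs; the assignment of bridges to pairs of land areas follows the usual textbook description (two bridges from the island A to each bank, one from A to the eastern area, one from each bank to the eastern area) and is an assumption here, since Figure 1 is not legible in this copy.

```python
from collections import Counter, defaultdict

# Assumed encoding of the seven bridges as pairs of land areas A, B, C, D.
KONIGSBERG = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]


def odd_vertices(edges):
    """Vertices of odd local degree, in sorted order."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def is_connected(edges):
    """Depth-first search over the vertices that appear in some edge."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    if not neighbours:
        return True
    start = next(iter(neighbours))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in neighbours[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(neighbours)


def has_euler_path(edges):
    """Euler's condition: connected and at most two odd vertices."""
    return is_connected(edges) and len(odd_vertices(edges)) <= 2
```

Under this encoding all four land areas have odd degree, so `has_euler_path(KONIGSBERG)` is false, in agreement with the text.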

[FIG. 1. Panels: MAP and GRAPH.]

A constructive, algorithmic demonstration of the Euler path result borrowing
from [4, p. 165] and [10, p. 39] is as follows: Suppose graph G has exactly two
odd vertices a and b. Start growing a path $\pi_1$ at vertex a and continue it as far
as possible without repeating any edges. This path cannot get stuck at an even
vertex because each time a vertex is crossed by the path two of its edges are used;
furthermore, the path cannot stop at a because the first edge of the path is adjacent
to vertex a leaving an even number of unused edges for subsequent crossings through
a. Therefore, path $\pi_1$ stops at the only other odd vertex, b. If $\pi_1$ includes all edges
of G, then it is the required Euler path. If not, we delete the edges of $\pi_1$ from G
and obtain a new graph G’ whose local degrees are all even since $\pi_1$ meets each vertex
through an even number (possibly zero) of edges with the exception of a and b. The
connectedness of G implies there is a vertex c on $\pi_1$ which is contained in an edge
of G’. We construct a path $\pi_2$ in G’ starting at c, which must return to c because all
local degrees of G’ are even. Enlarge $\pi_1$ to the “spliced” path $\pi_1(a,c)/\pi_2/\pi_1(c,b)$ and
repeat the process until $\pi_1$ contains all edges of G. It can be shown by a similar
argument [10, p. 40] that a graph with $2N > 0$ odd vertices can be covered by
exactly N paths. More detailed discussions on Euler paths can be found in Ore
[10] and Berge [4].
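
The grow-and-splice construction above can be sketched in code. What follows is not a transcription of the text's procedure but the common stack-based variant of the same idea (usually called Hierholzer's algorithm): the stack plays the role of the path being grown, and backing up along it performs the splicing implicitly. The function name and the edge-list input format are assumptions of this example.

```python
from collections import defaultdict


def euler_path(edges):
    """Return an Euler path as a vertex sequence, or None if none exists."""
    if not edges:
        return []
    incident = defaultdict(list)            # vertex -> list of (neighbour, edge id)
    for i, (u, v) in enumerate(edges):
        incident[u].append((v, i))
        incident[v].append((u, i))
    odd = [v for v in incident if len(incident[v]) % 2 == 1]
    if len(odd) not in (0, 2):
        return None
    start = odd[0] if odd else next(iter(incident))
    used = [False] * len(edges)
    stack, path = [start], []
    while stack:
        v = stack[-1]
        # Discard edges at v that were already traversed from the other side.
        while incident[v] and used[incident[v][-1][1]]:
            incident[v].pop()
        if incident[v]:
            w, i = incident[v].pop()
            used[i] = True
            stack.append(w)                 # keep growing the current path
        else:
            path.append(stack.pop())        # stuck: back up, splicing detours in
    if len(path) != len(edges) + 1:
        return None                         # some edges unreachable: not connected
    return path[::-1]
```

For a triangle a, b, c with a pendant edge from a to d, the two odd vertices are a and d, and the returned path starts at one of them and ends at the other.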


4. Edmonds’ lemma. Edmonds [6, p. 453] gave a one-sentence proof of the
theorem of Berge [1] based on the following lemma:


LEMMA E. Let $M_1$ and $M_2$ be matchings in graph G and let $M_1 + M_2$ denote
the set of edges in $M_1$ or $M_2$ but not both. Then the subgraph $G_{12}$ formed by $M_1 + M_2$
has connected components which are paths and circuits, each of which is
$M_1$-alternating as well as $M_2$-alternating. Each end vertex of these paths is either
$M_1$-exposed or $M_2$-exposed.


Proof. No vertex of $G_{12}$ has local degree greater than two since a vertex can meet
at most one $M_1$ edge and one $M_2$ edge. Hence, the graph $G_{12}$ consists entirely of
paths and circuits. Let a be an end vertex of one of the paths in $G_{12}$ and suppose the
adjacent end edge belongs to $M_1 - M_2$. Since $M_1$ is a matching, a is not adjacent
to any other $M_1$ edge. Any $M_2$ edge adjacent to a would therefore be in $M_1 + M_2$
contradicting the fact that a is an end vertex of $G_{12}$. We conclude that a is
$M_2$-exposed.

To prove a non-maximum matching M contains an M-augmenting path is 
now a matter of simple arithmetic! If M’ is a matching larger than M, then some 
component of the subgraph M + M’ must contain more M’ edges than M edges,
implying that its end edges are in $\bar M$ and its end vertices are M-exposed.
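
The counting argument above suggests a direct computation: form the symmetric difference of the two matchings, split it into connected components, and pick a component in which the larger matching is ahead. The sketch assumes both matchings are sets of `frozenset` edges; the function name is hypothetical.

```python
from collections import defaultdict


def augmenting_component(matching, larger):
    """Edges of a component of the symmetric difference with more edges of `larger`."""
    difference = matching ^ larger
    at = defaultdict(set)                     # vertex -> difference edges containing it
    for e in difference:
        for x in e:
            at[x].add(e)
    seen = set()
    for start in at:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            for e in at[x]:
                component.add(e)
                for y in e:
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
        if len(component & larger) > len(component & matching):
            return component
    return None
```

Swapping edge status along the returned component (`matching ^ component`) gives a matching with exactly one more edge, which is the augmentation promised by Berge's theorem.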

The natural decomposition of $M_1 + M_2$ into alternating paths depends heavily
on the special “oneness” of matchings (1-packings). An analogous result for general
$\delta$-packings requires some extra device for generating the alternating paths. The Euler
paths of Section 3 supply an adequate mechanism for this purpose.


5. Alternating Euler paths. We begin this section by applying the Euler path
idea to prove a generalization of Edmonds’ Lemma E. We have used the notation
d(v) to represent the local degree of vertex v in graph G. In what follows, we shall
be concerned with the various local degrees of a single vertex v in various subgraphs
$H_i$ of G and we use $d(v,H_i)$ to denote the number of edges of $H_i$ which contain v.


LEMMA EEZ. Let $H_1$ and $H_2$ be subgraphs of G which have no common edges.
Then the subgraph $G_{12}$ generated by $H_1 \cup H_2$ can be decomposed into an edgewise
disjoint family of paths whose edges alternate between $H_1$ and $H_2$, such that each
path is one of the following types:

(1) Closed paths of even length.

(2) Closed paths of odd length such that the unique vertex v incident to two
adjacent edges of the same $H_i$ satisfies $d(v,H_i) - d(v,H_j) \ge 2$ where $j \ne i$.

(3) Non-closed paths such that if e is an end edge in $H_i$ containing end vertex
v, then $d(v,H_i) - d(v,H_j) \ge 1$, where $j \ne i$.


Proof. Let $\Delta_{12}(v) = d(v,H_1) - d(v,H_2)$ for all vertices in $G_{12}$ and call vertex v
balanced, positive or negative according as $\Delta_{12}(v)$ is zero, positive or negative. If all
vertices of $G_{12}$ are balanced, then each connected component of $G_{12}$ enjoys the same
property and hence contains an Euler circuit (a closed Euler path). Furthermore,
since $d(v,H_1) = d(v,H_2)$ for each vertex, the Euler path can be chosen to alternate
between edges of $H_1$ and $H_2$, and so becomes a path of type 1.

If $G_{12}$ contains unbalanced vertices, let $v_1$ be one such and, for convenience,
suppose it to be positive. The argument is similar for negative vertices.

Since $v_1$ is positive, $d(v_1,H_1) - d(v_1,H_2) \ge 1$ and, therefore, $d(v_1,H_1) \ge 1$. Let $e_1$
be one of the edges of $H_1$ adjacent to $v_1$ and let $v_2$ be the other vertex of $e_1$. We
select an edge $e_2$ among the $H_2$ edges at $v_2$ and add it to the path (if there exists such
an edge). This process is continued as long as the path alternates between $H_1$ and $H_2$
and does not use the same edge twice. The finiteness of the graph ensures
termination; this can happen in several ways.

If the path terminates at $v_t$ via edge $e_{2k}$ in $H_2$, then $v_t \ne v_1$ since the positiveness
of $v_1$ ensures that every time we enter that vertex via an $H_2$ edge, there will be an
unused $H_1$ edge available for exit. Since the path uses up equal numbers of $H_1$ and
$H_2$ edges at each vertex it crosses (except $v_1$ and $v_t$), we can be forced to stop at
$v_t$ after edge $e_{2k}$ in $H_2$ only if $v_t$ is negative. In this case, the path $\{e_1, e_2, \ldots, e_{2k}\}$
is of type 3. Deleting this path from $G_{12}$ produces a new graph $G'_{12}$ in which the
path tracing can be resumed. Had $v_1$ been a negative vertex, the even length path
would have ended at a positive vertex $v_t$. In either case, the deletion of such a path
may create some balanced vertices but never any positive or negative ones.

If the path from positive $v_1$ ends at $v_t$ via edge $e_{2k+1}$ in $H_1$, then either $v_t = v_1$
and $\Delta_{12}(v_1) \ge 2$ or else $v_t \ne v_1$ and $v_t$ is positive. The first case gives a path of type 2
and the second case one of type 3. Similar results hold if $v_1$ is negative and once
again the paths can be deleted and the path tracing resumed in the reduced graph.
When we arrive at a reduced graph with no unbalanced vertices, we decompose each
connected component into a path of type 1, as indicated earlier in the proof. The
path tracing must terminate for lack of edges or unbalanced vertices so the lemma
is proved.

We call the paths of Lemma EEZ alternating Euler paths. 
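
The path tracing used in the proof can be sketched as a program. This is a simplified reading under stated assumptions: the two edge classes are given as lists of vertex pairs with no loops, all names are hypothetical, and in the balanced phase the sketch emits several closed alternating trails per component instead of splicing them into a single Euler circuit. Every emitted trail is still of one of the three types in the lemma.

```python
from collections import defaultdict


def alternating_euler_paths(h1, h2):
    """Decompose h1 + h2 into edge-disjoint trails alternating between the classes.

    Each trail is a list of steps (from_vertex, to_vertex, edge_class).
    """
    edges = [(u, v, 1) for u, v in h1] + [(u, v, 2) for u, v in h2]
    used = [False] * len(edges)
    incident = defaultdict(lambda: {1: [], 2: []})
    delta = defaultdict(int)                # current imbalance d(v,H1) - d(v,H2)
    for i, (u, v, c) in enumerate(edges):
        incident[u][c].append((i, v))
        incident[v][c].append((i, u))
        delta[u] += 1 if c == 1 else -1
        delta[v] += 1 if c == 1 else -1

    def take(v, c):
        """Use up one unused class-c edge at v; return its other end or None."""
        candidates = incident[v][c]
        while candidates:
            i, w = candidates.pop()
            if not used[i]:
                used[i] = True
                sign = 1 if c == 1 else -1
                delta[v] -= sign            # deleting the edge changes both ends
                delta[w] -= sign
                return w
        return None

    def trace(start, c):
        """Grow an alternating trail from start, beginning with a class-c edge."""
        trail, v = [], start
        while True:
            w = take(v, c)
            if w is None:
                return trail
            trail.append((v, w, c))
            v, c = w, 3 - c                 # switch class at every step

    vertices = list(incident)
    paths = []
    for v in vertices:                      # phase 1: remove all imbalance
        while delta[v] != 0:
            paths.append(trace(v, 1 if delta[v] > 0 else 2))
    for v in vertices:                      # phase 2: balanced remainder
        while True:
            trail = trace(v, 1)
            if not trail:
                break
            paths.append(trail)
    return paths
```

As a usage example, with `h1` a triangle on 1, 2, 3 and `h2` the single edge (3, 4), the sketch produces three trails, and the count of odd trails starting in class 1 minus those starting in class 2 equals the difference in size of the two classes.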


COROLLARY 1. If $\alpha_1$ is the number of odd-length paths (possibly closed) with more
$H_1$ edges than $H_2$, and similarly for $\alpha_2$, then

$$\alpha_1 - \alpha_2 = |H_1| - |H_2| = \tfrac{1}{2} \sum_{v \in G_{12}} \Delta_{12}(v).$$


Proof. Even-length alternating paths have equal numbers of edges from $H_1$ and
$H_2$ and so contribute nothing to the expression $|H_1| - |H_2|$. Each odd-length
alternating path has exactly one more $H_1$ edge than it has $H_2$ edges or vice versa
thereby contributing $\pm 1$ to $|H_1| - |H_2|$. Because each edge is counted twice when
local vertex degrees are summed over all vertices,

$$\sum_{v \in G_{12}} \Delta_{12}(v) = \sum_{v \in G_{12}} d(v,H_1) - \sum_{v \in G_{12}} d(v,H_2) = 2|H_1| - 2|H_2|.$$

COROLLARY 2. Let $H_1$ and $H_2$ be as in Lemma EEZ and let $|H_1| > |H_2|$. Then
$G_{12}$ contains a path $\pi$ which alternates between $H_1$ and $H_2$, has end edges in $H_1$
and end vertices $v_1$ and $v_2$ satisfying one of the following conditions:

(1) $\Delta_{12}(v_i) \ge 1$ for $i = 1, 2$ if $v_1 \ne v_2$.
(2) $\Delta_{12}(v_1) \ge 2$ if $v_1 = v_2$.

Proof. By Corollary 1 we get $\alpha_1 - \alpha_2 = |H_1| - |H_2| \ge 1$ and hence $\alpha_1 \ge 1$.
This assures the existence of an odd length path of type 2 or 3 with end edges in $H_1$.


6. The theorem of Berge-Norman-Rabin. We can now give a simple proof of the
Berge-Norman-Rabin theorem [4, p. 175] using Lemma EEZ and Corollary 2.


THEOREM BNR. If P is a non-maximum $\delta$-packing in graph G, then G contains
a $(P,\delta)$-augmenting path.


Proof. Let P* be a larger $\delta$-packing and put $H_1 = P^* - P$ and $H_2 = P - P^*$.
Applying Lemma EEZ and Corollary 2 (since $|H_1| > |H_2|$), we get a path which
alternates edges of $\bar P$ and P (i.e., P-alternating), has end edges in $\bar P$ and end vertices
$v_1$ and $v_2$ each satisfying condition (1) or (2) of Corollary 2. Since the edges in $P \cap P^*$
contribute to both terms of the expression $d(v_i,P^*) - d(v_i,P)$, we see easily that

$$d(v_i,P^*) - d(v_i,P) = d(v_i,H_1) - d(v_i,H_2).$$

Combining this with conditions (1) and (2) of Corollary 2 and the inequality
$d(v_i,P^*) \le \delta(v_i)$, we find that

$$d(v_i,P) \le \delta(v_i) - 1 \quad \text{for } i = 1, 2 \text{ if } v_1 \ne v_2,$$

$$d(v_1,P) \le \delta(v_1) - 2 \quad \text{if } v_1 = v_2.$$

Hence, the path is $(P,\delta)$-augmenting.
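
The final inequalities follow from a one-line chain, written out here as a worked step. The symbol $\Delta_{12}(v) = d(v,H_1) - d(v,H_2)$ is the vertex differential of Lemma EEZ, with $H_1 = P^* - P$ and $H_2 = P - P^*$ as in the proof.

```latex
\begin{align*}
d(v_i,P) &= d(v_i,P^*) - \bigl(d(v_i,H_1) - d(v_i,H_2)\bigr)
          = d(v_i,P^*) - \Delta_{12}(v_i) \\
         &\le \delta(v_i) - \Delta_{12}(v_i)
          \le
          \begin{cases}
            \delta(v_i) - 1, & v_1 \ne v_2 \ \text{(condition (1))},\\
            \delta(v_1) - 2, & v_1 = v_2 \ \text{(condition (2))}.
          \end{cases}
\end{align*}
```

So each end vertex is deficient by at least one, and by at least two when the two ends coincide, which is exactly what the definition of an augmenting path requires.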


7. Graphs with edge dichotomies. Any graph G whose edge set E has been
dichotomized (i.e., partitioned into two subsets) can be decomposed by Lemma EEZ into
a family of edge-disjoint alternating Euler paths with fairly natural conditions on
the end vertices. If $E = E_1 \cup E_2$ is the dichotomy, let $H_i$ for $i = 1, 2$ be the subgraph
of G generated by the edge set $E_i$, and apply Lemma EEZ (in this case $G_{12} = G$).
We separated Lemma EEZ from the proof of Theorem BNR because the alternating
path decomposition is a general phenomenon not dependent on packings or covers
or local degree constraints. The following corollaries strengthen Lemma EEZ
somewhat:


COROLLARY 3. If graph $G_{12}$ in Lemma EEZ is connected, then it can be
decomposed so that there is, at most, one path of type 1, and that only if it is the sole path
covering all of $G_{12}$.


Proof. Because $G_{12}$ is connected, the even-length closed (type 1) paths can be
“spliced” together or into other paths of types 2 or 3. If at least one path of type
2 or 3 exists, then all the paths of type 1 can be made to disappear into one or more
of the paths of types 2 or 3. The splicing is similar to that used in Section 3 for Euler
paths. Clearly, at least one path is required, so a single type 1 path is possible.

To characterize further the alternating path decompositions, we need some
additional terminology. Let $\alpha_0$ be the number of even-length paths of type 3 and
let $\alpha_i$ for $i = 1$ and 2 be, as before, the number of odd-length paths of types 2 and 3
with more $H_i$ edges. It is then obvious that the number of paths of type 2 or 3 is
exactly $(\alpha_0 + \alpha_1 + \alpha_2)$. We call $\Delta^T = \sum |\Delta_{12}(v)|$ the total vertex imbalance for
dichotomy $(H_1,H_2)$.


COROLLARY 4. The path decomposition of a connected $G_{12}$ as presented in
Corollary 3, is minimum in the sense that no other representation of $G_{12}$ as a family
of alternating paths has fewer paths. Furthermore, the number of paths of type 2 or
3 is related directly to the vertex differentials $\Delta_{12}(v)$ by

$$\alpha_0 + \alpha_1 + \alpha_2 = \tfrac{1}{2} \sum_{v \in G_{12}} |\Delta_{12}(v)| = \Delta^T/2.$$


Proof. First we show that $\Delta^T/2$ is a lower bound for the number of paths in an
alternating family. Let F be a family of alternating paths for $G_{12}$ and consider a
single vertex v with differential $\Delta_{12}(v) > 0$. The edges of $H_1$ and $H_2$ incident to v can
be paired off except for exactly $|\Delta_{12}(v)|$ extra $H_1$ edges. Each pair of edges corresponds
to the occurrence of v as an internal vertex of an alternating path of F. Each extra
edge must represent the occurrence of v as an end vertex. An identical argument
holds for $\Delta_{12}(v) < 0$. The paths of any alternating path decomposition must hence
account for at least $\Delta^T$ end vertex occurrences, but each path can handle, at most,
2 so $\Delta^T/2$ is indeed a lower bound. In the proof of Lemma EEZ, paths of type 2 or 3
are constructed only between end vertices which are currently unbalanced and each
path deletion decreases the total vertex imbalance by 2 units. The construction of type
2 and type 3 paths terminates when the total vertex imbalance is reduced to zero so
the total number of such paths is precisely $\Delta^T/2$. This establishes the formula and
the minimality follows because our particular decomposition achieves the lower bound.

Let us call $\Delta^* = \sum \Delta_{12}(v)$ the net vertex imbalance. It is then tempting to ask
if a decomposition of $G_{12}$ into alternating paths can be accomplished with
$\alpha_1 + \alpha_2 = |\Delta^*|/2$, it being clear that $\alpha_1 + \alpha_2 \ge |\Delta^*|/2$. If equality does hold,
then either $\alpha_1 = \Delta^*/2$ while $\alpha_2 = 0$, or else $\alpha_2 = -\Delta^*/2$ while $\alpha_1 = 0$. Figure 2
depicts a simple dichotomized graph with $\Delta^* = 0$ which requires $\alpha_1 + \alpha_2 \ge 2$.


[FIG. 2. A dichotomized graph; edge classes labeled $H_1$ and $H_2$.]


On the other hand, this seems to result from the lack of connectedness between
positive and negative vertices so equality may be achievable under some sort of
multiple connectedness assumption. In any case, it would be interesting to find a
decomposition with minimum value of $\alpha_1 + \alpha_2$ and know how the minimum relates
to the structure of graph $G_{12}$.
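
The two imbalance quantities are simple to compute. The sketch below uses a hypothetical helper `imbalances` and a made-up example (two disjoint edges, one in each class), which is in the spirit of the discussion above but is not claimed to be the graph of Fig. 2.

```python
from collections import Counter


def imbalances(h1, h2):
    """Return (total vertex imbalance, net vertex imbalance) for the dichotomy."""
    differential = Counter()                # d(v,H1) - d(v,H2) for each vertex
    for u, v in h1:
        differential[u] += 1
        differential[v] += 1
    for u, v in h2:
        differential[u] -= 1
        differential[v] -= 1
    total = sum(abs(x) for x in differential.values())
    net = sum(differential.values())
    return total, net


# Hypothetical example: one H1 edge and one H2 edge with no vertex in common.
total, net = imbalances([("a", "b")], [("c", "d")])
```

Here the net imbalance is 0 while the total imbalance is 4, so any alternating decomposition needs two paths, one odd path of each kind, even though half the net imbalance is zero.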


8. Acknowledgements. I am deeply indebted to former colleagues at the National Bureau of
Standards: J. Edmonds, A. J. Goldman, and C. Witzgall. In particular, the ideas and style of J.
Edmonds in the area of combinatorial graph theory have been a great influence on my approach to
such problems. The work reported here was begun while the author was at the National Bureau of
Standards and was partially supported by the Army Research Office (Durham).

More recent support has come from the National Science Foundation and the Stanford Linear
Accelerator Center (funded by the Atomic Energy Commission).


References 


1. C. Berge, Two theorems in graph theory, Proc. Nat. Acad. Sci. U.S.A., 43 (1957) 842-844.
2. R. Z. Norman and M. O. Rabin, An algorithm for a minimum cover of a graph, Proc. Amer.
Math. Soc., 10 (1959) 315-319.
3. J. Edmonds, Covers and packings in a family of sets, Bull. Amer. Math. Soc., 68 (1962)
494-499.
4. C. Berge, The Theory of Graphs and its Applications, Methuen, London, 1962.
5. A. J. Goldman, Optimal matchings and degree-constrained subgraphs, J. Res. Nat. Bur. Stds.,
68B (1964) 27-29.
6. J. Edmonds, Paths, trees, and flowers, Canad. Math. J., 17 (1965) 449-467.
7. L. Euler, The seven bridges of Königsberg, The World of Mathematics, J. R. Newman ed.,
Simon and Schuster, New York, 1956, pp. 573-580.
8. C. Witzgall and C. T. Zahn Jr., Modification of Edmonds’ maximum matching algorithm, J.
Res. Nat. Bur. Stds., 69B (1965) 91-98.
9. J. Edmonds, Maximum matching and a polyhedron with 0, 1-vertices, J. Res. Nat. Bur. Stds.,
69B (1965) 125-130.
10. O. Ore, Theory of Graphs, Amer. Math. Soc., Providence, Rhode Island, 1962.
11. J. Petersen, Die Theorie der regulären Graphen, Acta Math., 15 (1891) 193-220.
12. W. T. Tutte, The factors of graphs, Canad. Math. J., 4 (1952) 314-328.


MATHEMATICAL NOTES 


EDITED BY ROBERT GILMER 


The present backlog for this Department is substantial. Until further notice, new manuscripts 
cannot be accepted. This moratorium will probably continue until June 1, 1973; authors are 
requested to hold their manuscripts pending a further announcement. 


STABLE LAWS AND THE IMBEDDING OF $L^p$ SPACES
MAREK KANTER, Tulane University, New Orleans 


1. Introduction. It was conjectured by Banach [1] that if $1 \le p < q \le 2$ then
there exists a linear isometry from $L^q[0,1]$ into $L^p[0,1]$. The truth of this conjecture
is now known (see [6]).

In fact, more is known. In [3], Lemma 1 on p. 238, there is given a proof which
essentially demonstrates that $L^q[0,1]$ can be linearly and isometrically imbedded into
$L^p[0,1]$ if $0 < p < q \le 2$. However, in Theorem 2 of [3] on p. 238, when this Lemma
is applied, $p$ and $q$ are restricted to be equal to or greater than 1. Also in an article as
recent as [7], the solution of Banach's conjecture when $0 < p < q \le 2$ is left as an
open problem.




We have decided that it would be useful to write an expository and reasonably 
self-contained paper that proves the truth of Banach's conjecture for $0 < p < q \le 2$.
Our proof is different, and more direct, than the proof in [3]. It rests completely upon 
our being able to define a stochastic integral with respect to a certain stochastic 
process. The stochastic integral that we define is an interesting object in itself, and 
one of the purposes of this paper is to introduce this stochastic integral to probabilists 
and analysts. 


2. Notations and definitions. A measure space is a triple $(\Omega, \mathcal{B}, \mu)$, where $\Omega$ is
any set, $\mathcal{B}$ is a sigma-field of subsets of $\Omega$, and $\mu$ is a nonnegative measure on $\mathcal{B}$. We
assume from now on that all our measure spaces are complete, i.e., if $C \subset D$ with
$D \in \mathcal{B}$ and $\mu(D) = 0$ then $C \in \mathcal{B}$.

For $p > 0$ we shall write $L^p[\Omega]$ to denote the set of all real valued $\mathcal{B}$-measurable
functions $g$ with

$$\|g\|_p = \left[\int |g|^p \, d\mu\right]^{1/p} < \infty.$$

If $(\Omega', \mathcal{B}', \mu')$ is another measure space, then a linear mapping $T$ from $L^q[\Omega']$ to
$L^p[\Omega]$ is said to be an isometry if $\|T(f)\|_p = \|f\|_q$ for $f \in L^q[\Omega']$.


3. Some concepts from probability. If $\mu$ is a probability measure, i.e., if $\mu(\Omega) = 1$,
then we use the symbol $P$ instead of $\mu$, and we call $(\Omega, \mathcal{B}, P)$ a probability triple. By
definition a random variable on $\Omega$ is a real valued $\mathcal{B}$-measurable function on $\Omega$, and a
stochastic process on $\Omega$ is a collection of random variables on $\Omega$. If $X$ and $Y$ are two
random variables on $\Omega$ then we say that $X = Y$ a.s. if $P\{\omega \mid \omega \in \Omega,\ X(\omega) \ne Y(\omega)\} = 0$.

If $X$ is a random variable on $\Omega$ then the characteristic function $\phi_X(v)$ is defined for
all real $v$ by

$$\phi_X(v) = E(e^{ivX}),$$

where $E$ stands for integration with respect to the measure $P$. (In the following we
shall sometimes write $\exp(z)$ instead of $e^z$ to denote exponentiation.)

If $X$ is a random variable on $\Omega$ we say that $X$ is symmetric stable of index $q$
($q > 0$) if for all real $v$,

$$\phi_X(v) = \exp(-k|v|^q)$$

for some $k \ge 0$. (We also say that $X$ is symmetric $q$-stable.) We notice as in
[4, p. 486] that if $q > 2$ then there is no random variable $X$ such that $\phi_X(v)
= \exp(-k|v|^q)$, because

$$E(X^2) = -\frac{d^2}{dv^2}\phi_X(0) = 0$$

for $q > 2$.
We say that a collection $(X_1, \dots, X_n)$ of random variables on $\Omega$ is independent if
for all real $v_1, \dots, v_n, v$ we have $\phi_Z(v) = \prod_{j=1}^{n} \phi_{X_j}(v_j v)$, where $Z = \sum_{j=1}^{n} v_j X_j$. This is not
the usual definition of independent random variables, but it is certainly equivalent
to it (see [4], p. 495). From this definition we conclude that if each of the above $X_j$
is symmetric $q$-stable with $\phi_{X_j}(v) = \exp(-k_j|v|^q)$ for $j \in [1, \dots, n]$, then $Z = \sum_{j=1}^{n} v_j X_j$
is also symmetric stable of index $q$ and in fact


(*)  $\phi_Z(v) = \exp\Bigl(-\Bigl(\sum_{j=1}^{n} |v_j|^q k_j\Bigr)|v|^q\Bigr).$
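
For readers who want the intermediate step, (*) follows by substituting the stable characteristic functions into the product formula of the definition. The display below is only a worked restatement in the notation of this section; it uses nothing beyond $|v_j v|^q = |v_j|^q |v|^q$.

```latex
\begin{align*}
\phi_Z(v) &= \prod_{j=1}^{n} \phi_{X_j}(v_j v)
           = \prod_{j=1}^{n} \exp\bigl(-k_j |v_j v|^q\bigr) \\
          &= \exp\Bigl(-\Bigl(\sum_{j=1}^{n} k_j |v_j|^q\Bigr) |v|^q\Bigr).
\end{align*}
```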


We end this introduction with a few brief remarks about convergence in measure.
Namely, we say that a sequence $X_n$ of random variables converges in measure to the
random variable $X$ if for all $\varepsilon > 0$ we have that $\lim_{n \to \infty} P[|X_n - X| \ge \varepsilon] = 0$. It is
not hard to see that this is equivalent to having

$$\lim_{n \to \infty} E\bigl(|X_n - X| / (1 + |X_n - X|)\bigr) = 0.$$

Hence we conclude that if we define $\rho(X, Y)$ to be $E(|X - Y| / (1 + |X - Y|))$ and if
we identify random variables that are a.s. equal, then $\rho$ makes the set of random
variables on a probability space $(\Omega, \mathcal{B}, P)$ into a metric space $M$, and in fact $\rho$ metrizes
the notion of convergence in measure.


4. Existence of stable processes. We fix $q$ in the interval $(0,2]$. Paul Lévy not
only discovered $q$-stable random variables, but he also proved the existence of a
stochastic process $(X(t) \mid t \in [0,1])$ on a probability space $(\Omega, \mathcal{B}, P)$ such that

1. $X(0) = 0$ a.s.

2. For all $0 \le t_0 < \cdots < t_n \le 1$
the set $(X(t_{i+1}) - X(t_i) \mid i = 0, \dots, n-1)$ is a collection of $n$ independent random
variables such that


(**)  $\phi_Z(v) = \exp(-|t - s|\,|v|^q)$


for all real $v$ and $t, s \in [0,1]$ with $Z = X(t) - X(s)$.

In Lévy's work $\Omega$ is a specific function space, but what concerns us here is only
the fact that $(\Omega, \mathcal{B}, P)$ is a separable measure space in the measure theoretic sense.
(See [5], p. 168.)

We refer the interested reader to Breiman's book [2, Chapter 14] for an account
of the existence of such processes. Breiman (and Lévy as well) prove the existence of
more general processes. (Namely (**) is replaced by the condition that $\phi_Z(v)
= \phi_{|t-s|}(v)$, i.e., the functions $\phi_Z$ depend only on $|t - s|$.)


5. The main theorem. Suppose we are given $(X(t) \mid t \in [0,1])$, a stochastic process
satisfying conditions 1. and 2. above. If $f$ is a step function, i.e.,

$$f = \sum_{i=1}^{n} c_i I_{[t_i, t_{i+1})},$$

where $c_1, \dots, c_n$ are real numbers, $0 \le t_1 < \cdots < t_{n+1} \le 1$, and $I_{[t_i, t_{i+1})}$ is the indicator
function of the half open interval $[t_i, t_{i+1})$, then we set

$$\int f\,dX = \sum_{i=1}^{n} c_i \bigl(X(t_{i+1}) - X(t_i)\bigr).$$

We call $\int f\,dX$ the stochastic integral of $f$ with respect to $X$.


THEOREM. There exists a linear map $f \to \int f\,dX$ from $L^q[0,1]$ into the set of
symmetric $q$-stable random variables on $(\Omega, \mathcal{B}, P)$ which agrees with the definition
of the stochastic integral on step functions and which satisfies


(***)  $\phi_Z(v) = \exp\bigl(-(\|f\|_q)^q |v|^q\bigr)$ for $f \in L^q[0,1]$, $v$ real, and $Z = \int f\,dX$.


Proof. (***) holds for step functions because of (*) and (**). Also it is trivial
to see that the map $f \to \int f\,dX$ is linear for step functions. Now for any $f \in L^q[0,1]$
there exists a sequence of step functions $f_n$ such that $\|f - f_n\|_q \to 0$. In particular

$$\lim_{n,m \to \infty} \|f_n - f_m\|_q = 0.$$

Now we define $Z_{n,m} = \int f_n\,dX - \int f_m\,dX$. Also we define $\phi_n$ to be the characteristic
function of $\int f_n\,dX$ while $\phi_{n,m}$ is to be the characteristic function of $Z_{n,m}$.
By Breiman [2, p. 171] we have that for any $\varepsilon > 0$

$$P[|Z_{n,m}| \ge \varepsilon] \le k_0 \varepsilon \int_0^{1/\varepsilon} \bigl(1 - \phi_{n,m}(v)\bigr)\,dv,$$

where $k_0$ is a positive constant. If we use (***) to substitute for $\phi_{n,m}$ in this inequality
we conclude that $\lim_{n,m \to \infty} P[|Z_{n,m}| \ge \varepsilon] = 0$.

This implies that $\int f_n\,dX$ is a Cauchy sequence in the metric of convergence in
measure. Now it follows from Halmos [5, Theorem E, p. 93] that there exists a
random variable which is unique up to sets of measure zero (i.e., it defines a unique
element of $M$) and which is the limit of $\int f_n\,dX$ in this metric. We call this random
variable $\int f\,dX$. Furthermore by Lebesgue's bounded convergence theorem (see
Halmos [5], Theorem D, p. 110 and note that it works equally well for complex
valued functions) we conclude that if $\phi_Z$ is the characteristic function of $\int f\,dX$ then
$\phi_Z(v) = \lim_{n \to \infty} \phi_n(v)$, and from this we conclude that (***) is valid for $f \in L^q[0,1]$.

To finish we prove the linearity of the mapping $f \to \int f\,dX$. This mapping is
clearly linear on step functions. Now let $a$, $b$ be two real numbers. Then

$$\int (af + bg)\,dX = a \int f\,dX + b \int g\,dX \quad \text{a.s.}$$

upon taking limits along appropriate sequences of step functions $f_n$ and $g_n$. This
completes the proof of the main theorem.

COROLLARY. If $0 < p < q \le 2$ then there exists a linear isometry from $L^q[0,1]$
to $L^p[0,1]$.

Proof. We first show that if $X$ is symmetric stable of index $q$ then $E(|X|^p) < \infty$
if $0 < p < q$. This is a problem in Feller [4, p. 215]. We notice as he does that from
the symmetrization inequalities in [4, p. 147, 148] we can get

$$\tfrac{1}{2}\bigl(1 - \exp\{-n P[|X| \ge n^{1/q} t]\}\bigr) \le P[|X| \ge t]$$

for any natural number $n$ and any positive number $t$. If we set $t = n^{\varepsilon}$ with $\varepsilon > 0$,
we conclude $n P[|X| > n^{1/q + \varepsilon}]$ is bounded. This easily implies that $n^{1+\varepsilon} P[|X|
> n^{(1+\varepsilon)/q + \varepsilon}]$ is also bounded. If we choose $\varepsilon$ so small that $p < \{(1+\varepsilon)/q + \varepsilon\}^{-1}$
then we conclude that $\sum_{n=1}^{\infty} P[|X|^p \ge n] < \infty$, and hence that $E(|X|^p) < \infty$.

Let us write $C_{p,q} = E(|X_0|^p)^{1/p}$, where $\phi_{X_0}(v) = e^{-|v|^q}$. Then if $\phi_Y(v) = e^{-k|v|^q}$ it
follows that $k^{1/q} C_{p,q} = (E(|Y|^p))^{1/p}$ since $Y$ is distributed like $k^{1/q} X_0$.

Now let $(X(t) \mid t \in [0,1])$ be a stochastic process on $(\Omega, \mathcal{B}, P)$ which satisfies
Conditions 1 and 2 above. By the theorem there exists a linear mapping $f \to \int f\,dX$
from $L^q[0,1]$ into the set of symmetric $q$-stable random variables on $(\Omega, \mathcal{B}, P)$.
Furthermore we have just verified that the mapping $f \to C_{p,q}^{-1} \int f\,dX$ is a linear
isometry from $L^q[0,1]$ into $L^p[\Omega]$.
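
The isometry claim can be checked in one line. The display is a sketch in the notation above, writing $Z = \int f\,dX$ and using (***), which says that $Z$ is symmetric $q$-stable with constant $k = (\|f\|_q)^q$.

```latex
\begin{align*}
\bigl\| C_{p,q}^{-1} Z \bigr\|_p
  = C_{p,q}^{-1}\,\bigl(E(|Z|^p)\bigr)^{1/p}
  = C_{p,q}^{-1}\, k^{1/q}\, C_{p,q}
  = \bigl((\|f\|_q)^q\bigr)^{1/q}
  = \|f\|_q .
\end{align*}
```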

To finish the proof of the corollary we must map $L^p[\Omega]$ linearly and isometrically
into $L^p[0,1]$. For those who are willing to believe, I can simply say that we could
have taken $(\Omega, \mathcal{B}, P)$ to be the unit interval with Lebesgue measure. However, those
readers who have referred to Breiman for the existence of the process $X(t)$ have in
mind the specific separable measure space $(\Omega, \mathcal{B}, P)$ constructed there (see [2, p. 306]).
These readers must now check that $(\Omega, \mathcal{B}, P)$ is nonatomic as well as separable, and
then apply Theorem C of [4, p. 173] to conclude that the measure algebra of $(\Omega, \mathcal{B}, P)$
is isomorphic to the measure algebra of the unit interval, which of course implies
that $L^p[\Omega]$ is linearly isometric with $L^p[0,1]$. (Another route would be, instead of
checking that $(\Omega, \mathcal{B}, P)$ is nonatomic, to generalize Theorem C of [4, p. 173] so that
the hypothesis of nonatomicity is dropped. In this case one no longer gets a measure
algebra isomorphism into the measure algebra of the unit interval, but rather just a
homomorphism. This implies that $L^p[\Omega]$ can be mapped linearly and isometrically
into $L^p[0,1]$.)


References 


1. S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932.

2. L. Breiman, Probability, Addison-Wesley, Reading, Mass., 1968.

3. J. Bretagnolle, D. Dacunha-Castelle, and J. L. Krivine, Lois stables et espaces $L^p$, Ann.
Inst. H. Poincaré Sect. B, 2 (1966) 231-259.

4. W. Feller, An Introduction to Probability Theory and its Applications, vol. 2, Wiley, New
York, 1966.

5. P. Halmos, Measure Theory, Van Nostrand, New York, 1950.

6. J. Lindenstrauss and A. Pelczynski, Absolutely summing operators in $\mathcal{L}_p$ spaces and their
applications, Studia Math., 29 (1968) 275-326.

7. W. J. Stiles, On properties of subspaces of $l_p$, $0 < p < 1$, Trans. Amer. Math. Soc., 149
(1970) 405-416.


A CONVEX MATRIX FUNCTION 
M. H. Moore, University of Florida 


Let $P_n$ be the class of all (strictly) positive definite symmetric $n \times n$ matrices,
and let $N_n$ be the class of all non-negative definite symmetric $n \times n$ matrices.
Thus $P_n \subset N_n$.

We use the standard notation "$A \ge B$" to indicate that $A - B \in N_n$, and
similarly "$A > B$" means that $A - B \in P_n$.

Our objective is to state and prove the following theorem:


THEOREM. Let $A$ and $B$ belong to $P_n$, and let $0 \le \lambda \le 1$. Then

$$[\lambda A + (1-\lambda)B]^{-1} \le \lambda A^{-1} + (1-\lambda)B^{-1}.$$


With the obvious definitions, the theorem may be interpreted as saying that the
matrix inverse function is convex on $P_n$.

The theorem complements a recent result of Ky Fan [4] who has obtained the 
convexity property of the theorem for the so-called ‘‘M matrices’’ of Ostrowski. 

Proof. Since $A$ and $B$ are symmetric and positive definite, the matrices are
simultaneously diagonalizable. More precisely, the following well-known (see,
for example, [5], p. 310) result holds:


LEMMA. Let $A$ and $B$ be symmetric $n \times n$ matrices with $A \in P_n$. Then there
exists a (real) non-singular matrix $Q$ such that

$$A = QQ^T, \qquad B = QDQ^T,$$

where $D$ is a diagonal matrix whose diagonal elements $\lambda_i$ are real and are the
solutions to $\det(B - \lambda A) = 0$. If, moreover, $B$ is non-negative (positive) definite,
then the $\lambda_i$ are non-negative (positive).


Applying this result in our case, we have $A = QQ^T$, $B = QDQ^T$, and

$$\lambda A + (1-\lambda)B = Q[\lambda I + (1-\lambda)D]Q^T = QE_\lambda Q^T,$$

where $Q$ is a non-singular matrix and $D = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ ($\lambda_i > 0$ for $1 \le i \le n$), and
$E_\lambda = \lambda I + (1-\lambda)D$.

Notice that $D$ and $E_\lambda$ are non-singular, all diagonal elements of each matrix being
positive. Put $P = Q^{-1}$; then

$$A^{-1} = (QQ^T)^{-1} = (Q^T)^{-1}Q^{-1} = (Q^{-1})^T Q^{-1} = P^T P,$$

$$B^{-1} = P^T D^{-1} P, \quad \text{and} \quad [\lambda A + (1-\lambda)B]^{-1} = P^T E_\lambda^{-1} P.$$

We now find that

$$\lambda A^{-1} + (1-\lambda)B^{-1} - [\lambda A + (1-\lambda)B]^{-1} = \lambda P^T P + (1-\lambda)P^T D^{-1} P - P^T E_\lambda^{-1} P$$
$$= P^T[\lambda I + (1-\lambda)D^{-1} - E_\lambda^{-1}]P$$
$$= P^T R_\lambda P,$$

where $R_\lambda = \lambda I + (1-\lambda)D^{-1} - E_\lambda^{-1}$ is a diagonal matrix. But the $ii$th element of $R_\lambda$ is

$$(R_\lambda)_{ii} = \lambda + \frac{1-\lambda}{\lambda_i} - \frac{1}{\lambda + (1-\lambda)\lambda_i},$$

and, since the $\lambda_i$ are all positive, the convexity of the function $\phi(x) = 1/x$ for $x > 0$
shows that each $(R_\lambda)_{ii}$ is non-negative (in fact, positive unless $\lambda_i = 1$ or, trivially,
$\lambda = 0$ or $1$). Hence $R_\lambda \in N_n$. From this, it is clear that

$$0 \le P^T R_\lambda P = \lambda A^{-1} + (1-\lambda)B^{-1} - [\lambda A + (1-\lambda)B]^{-1},$$

which is the desired result.
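
The inequality is easy to test numerically in the smallest interesting case. The following is a minimal sketch, not part of the note: it works only with $2 \times 2$ symmetric matrices, uses exact rational arithmetic, and the helper names (`inv2`, `combine`, `convexity_gap`, `is_nonneg_definite`) and the two sample matrices are illustrative assumptions. It forms the difference $\lambda A^{-1} + (1-\lambda)B^{-1} - [\lambda A + (1-\lambda)B]^{-1}$ and checks that it is non-negative definite.

```python
from fractions import Fraction as F


def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists (exact arithmetic)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def combine(lam, A, B):
    """Entrywise lam*A + (1 - lam)*B."""
    return [[lam * A[i][j] + (1 - lam) * B[i][j] for j in range(2)]
            for i in range(2)]


def convexity_gap(lam, A, B):
    """lam*A^-1 + (1-lam)*B^-1 - (lam*A + (1-lam)*B)^-1."""
    right = combine(lam, inv2(A), inv2(B))
    left = inv2(combine(lam, A, B))
    return [[right[i][j] - left[i][j] for j in range(2)] for i in range(2)]


def is_nonneg_definite(M):
    """A symmetric 2x2 matrix is non-negative definite exactly when its
    diagonal entries and its determinant are all >= 0."""
    (a, b), (c, d) = M
    return b == c and a >= 0 and d >= 0 and a * d - b * c >= 0


# Two hypothetical positive definite matrices (determinants 5 and 1).
A = [[F(2), F(1)], [F(1), F(3)]]
B = [[F(5), F(-2)], [F(-2), F(1)]]

for k in range(5):
    lam = F(k, 4)
    print(lam, is_nonneg_definite(convexity_gap(lam, A, B)))
```

A check of this kind cannot replace the proof, but it is a quick way to catch a sign error when the inequality is used elsewhere.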

The matter of convexity for functions of matrices was introduced and
discussed by Krauss [6] in 1936. More recent results have been obtained by Bendat
and Sherman [2] and Chandler Davis [3]. The related idea of monotone functions
of matrices, which has found wide application in recent years, was brought forward
by Löwner [7] in 1934. Bellman [1], page 111, gives an account of other literature
in this intriguing area.


References 


1. R. Bellman, Introduction to Matrix Analysis (2nd ed.), McGraw-Hill, New York, 1970. 

2. J. Bendat and S. Sherman, Monotone and convex operator functions, Trans. Amer. Math. 
Soc., 79 (1955) 58-71. 

3. C. Davis, Notions generalizing convexity for functions defined on spaces of matrices, Proc. 
Symposia Pure Math., vol. 7: Convexity, American Mathematical Society, (1963) 187-201. 

4. Ky Fan, Inequalities for the sum of two M-matrices, Inequalities, Oved Shisha ed., Academic 
Press, New York, 1967, pp. 105-117. 

5. F. R. Gantmacher, Matrix Theory, vol. 1, Chelsea, New York, 1959. 

6. F. Krauss, Über konvexe Matrixfunktionen, Math. Z., 41 (1936) 18-42.

7. K. Löwner, Über monotone Matrixfunktionen, Math. Z., 38 (1934) 177-216.


SOLUTION OF FEJES TÓTH'S ILLUMINATION PROBLEM
B. R. HENRY, Syracuse University


In this note we complete the disproof of Fejes Tóth's illumination conjecture
([1] 1970) using the illumination function of A. Heppes (Guy and Klee 1971, [2]
p. 1118): $f(x)$ equals $1$, $\tfrac{1}{2}(m+1) - x$, or $0$, according as $0 \le x < \tfrac{1}{2}(m-1)$,
$\tfrac{1}{2}(m-1) \le x < \tfrac{1}{2}(m+1)$, or $\tfrac{1}{2}(m+1) \le x$.

Let $m$ be transcendental and suppose a lamp $L_0$ stands at the origin. If illumination
is to be uniform a lamp $L_1$ must stand at $m$ or at $-1$ in order to cancel the
kink in $f$ at $\tfrac{1}{2}(m-1)$. Let the position of $L_1$ be $c_1$. To cancel the corresponding
kink in $L_1$ a lamp $L_2$ must be placed at $c_1 + m$ or at $c_1 - 1$; call its position $c_2$.
Continuing, an infinite sequence $0, c_1, c_2, c_3, \dots$ of lamp coordinates is generated
(by irrationality of $m$) that are required if illumination is to be constant.

No two of the $c$'s differ by a multiple of $m/(m+1)$. For if $k$ is an integer and

$$c_i - c_j = \frac{km}{m+1},$$

then there are integers $a$, $b$ such that

$$am + b = \frac{km}{m+1},$$

so $am(m+1) + b(m+1) - km = 0$ or $am^2 + (a+b-k)m + b = 0$. By transcendentality
of $m$, $a = b = (a+b-k) = 0 \Rightarrow k = 0$.

If lamps are erected in a finite union of congruent point lattices at average density
$(m+1)/m$ then there are, for some $k$, $k$ lattices of span $km/(m+1)$. In any subset
of more than $k$ coordinates, some two coordinates must differ by a multiple of
$km/(m+1)$. By the remarks above no such arrangement gives uniform illumination.


Research supported by NSF grant GP-31379. 


References 


1. L. Fejes Tóth, A problem of illumination, this MONTHLY, 77 (1970) 869-870.
2. R. Guy and V. Klee, Monthly research problems, 1969-71, this MONTHLY, 78 (1971) 1113-1122. 


A COVERING THEOREM 
J. C. KIEFFER, University of Missouri-Rolla


Let $G$ be a nonempty bounded open set in $E^m$, Euclidean $m$-space. It is not
possible to find countably many open balls $B_i$, each contained in $G$, such that
$G = \limsup B_i$ and $\sum_i m(B_i) < \infty$. (Here, $m$ denotes Lebesgue measure.) This
follows from the well-known result from measure theory that if $\sum_i m(B_i) < \infty$,
then $m(\limsup B_i) = 0$. It is possible, however, to find balls $B_i$ in $G$ such that
$G = \limsup B_i$ and $\sum_i m(B_i)^p < \infty$ for every $p > 1$.


Proof. $B(x,r)$ denotes the open ball with center $x$ and radius $r$. For each positive
integer $n$, select all balls of form $B(x/n, r_n)$ such that:


(1) $B(x/n, r_n) \subset G$;


(2) $x$ is a lattice point (that is, its coordinates are all integers); and


(3) $r_n = \sqrt{m} / n^{1 + (1/m)}$.


For each $n$, the number of balls selected is $O(n^m)$, since $G$ is bounded. The total
collection of balls selected, as $n$ ranges over all positive integers, gives us a countable
collection of balls $B_i$. For an appropriate constant $C$, we have

$$\sum_i m(B_i)^p \le C \sum_n n^m r_n^{mp}, \qquad p > 1.$$

Substituting in the value for $r_n$, we see that $\sum_i m(B_i)^p < \infty$.
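
The substitution can be written out explicitly. The computation below is a sketch under two assumptions: that the radius in (3) is $r_n = \sqrt{m}/n^{1+1/m}$ (the choice suited to the simultaneous approximation theorem cited in the next paragraph), and that the measure of a ball of radius $r$ in $E^m$ is a constant multiple of $r^m$, the constant being absorbed into $C$.

```latex
\begin{align*}
\sum_i m(B_i)^p
  &\le C \sum_{n} n^{m}\, r_n^{mp}
   = C\, m^{mp/2} \sum_{n} n^{\,m-(m+1)p}, \\
m-(m+1)p &< -1 \iff p > 1 .
\end{align*}
```

Under these assumptions the bounding series converges for every $p > 1$, while for $p = 1$ it reduces to the harmonic series, which is consistent with the impossibility statement at the start of the note.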

To conclude the proof, we show that $G = \limsup B_i$. Suppose then that $y \in G$.
For sufficiently large $n$, we have $B(y, 2r_n) \subset G$. We use now a theorem which allows
us to approximate points in $E^m$ by points with rational coordinates in a certain way
(see [1], Theorem 4.6). This theorem states that for infinitely many $n$, it is possible
to find a lattice point $x_n$ such that $y \in B(x_n/n, r_n)$. For infinitely many $n$, then,
$B(y, 2r_n) \subset G$ and $y \in B(x_n/n, r_n)$ hold simultaneously. For each such $n$, a simple
application of the triangle inequality shows that $B(x_n/n, r_n) \subset G$. From the way
in which the $B_i$ were defined we conclude that $y \in B_i$ for infinitely many $i$.


Reference 


1. Ivan Niven, Irrational Numbers, MAA Carus Mathematical Monograph 11, (1956) 47. 


DISTRIBUTIVITY OVER THE DIRICHLET PRODUCT AND COMPLETELY 
MULTIPLICATIVE ARITHMETICAL FUNCTIONS 


ERIC LANGFORD, University of Maine


Recently in this MONTHLY, some interest has been expressed in the relationship 
between an arithmetical function’s being completely multiplicative and its being 
distributive over certain Dirichlet products. Lambek [3] proved (Theorem 1) that 
the arithmetical function f is completely multiplicative if and only if it distributes 
over every Dirichlet product. Problems of Carlitz [2] and Sivaramakrishnan [4] have 
shown that f is necessarily completely multiplicative if and only if it distributes over 
certain particular Dirichlet products. Apostol [1] also gives various conditions 
involving the Dirichlet product that guarantee that f is completely multiplicative. 

In this paper, we shall give sufficient conditions on a particular Dirichlet product 
in order that every arithmetical function which distributes over that product must 
necessarily be completely multiplicative. These conditions will be general enough so 
that the results of [1], [2], [3], and [4] in this area will follow as corollaries. 

We shall say that $f$ distributes over the Dirichlet product $g * h = k$ iff $fg * fh = fk$.
It is immediate that if $f$ is completely multiplicative, then $f$ distributes over every
Dirichlet product and hence over any particular such product. But every function $f$,
multiplicative or not, which satisfies $f(1) = 1$ distributes over the product $\delta * \delta = \delta$,
where $\delta(n) = 1$ if $n = 1$ and $0$ otherwise. The problem is to find those Dirichlet
products which are sufficiently "sensitive" so that only completely multiplicative
functions will distribute over them. Carlitz's problem shows that $1 * 1 = \tau$ is such a
product, where $\tau(n)$ denotes the number of divisors of $n$, and where $1(n) = 1$ for all $n$.
Sivaramakrishnan's problem essentially shows that $\phi * 1 = I$ is such a product,
where $I(n) = n$ is the identity function and where $\phi$ is Euler's totient function.
If $k = g * h$ is any Dirichlet product, we notice that


(1)  $k(n) = g(1)h(n) + g(n)h(1)$

whenever $n$ is prime. If (1) holds only when $n$ is prime, we say that the product
$k = g * h$ is discriminative.
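
These definitions are easy to experiment with on a finite range. The following is a minimal sketch, not taken from the paper: the function names (`dirichlet`, `is_discriminative`, `distributes`) and the cutoff `limit` are illustrative assumptions, and a finite check can only refute, never prove, that a product is discriminative or that a function distributes over it.

```python
from math import gcd, isqrt


def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(g, h):
    """Return the Dirichlet product k = g*h as a function of n."""
    return lambda n: sum(g(d) * h(n // d) for d in divisors(n))


def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))


def is_discriminative(g, h, limit):
    """Check, for 1 <= n <= limit, that equation (1) holds exactly at primes."""
    k = dirichlet(g, h)
    return all((k(n) == g(1) * h(n) + g(n) * h(1)) == is_prime(n)
               for n in range(1, limit + 1))


def distributes(f, g, h, limit):
    """Check fg * fh = fk for 1 <= n <= limit, where k = g*h."""
    k = dirichlet(g, h)
    lhs = dirichlet(lambda n: f(n) * g(n), lambda n: f(n) * h(n))
    return all(lhs(n) == f(n) * k(n) for n in range(1, limit + 1))


def one(n):
    return 1


def square(n):
    return n * n  # completely multiplicative


def phi(n):
    """Euler's totient: multiplicative but not completely multiplicative."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)


print(is_discriminative(one, one, 200))   # the product 1*1 = tau
print(distributes(square, one, one, 200))
print(distributes(phi, one, one, 50))
```

For the product $1 * 1 = \tau$, equation (1) reads $\tau(n) = 2$, which is why the check singles out exactly the primes; the totient fails to distribute already at $n = 4$.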

THEOREM 1. Suppose that $f(1) \ne 0$. Then $f$ is completely multiplicative if and
only if it distributes over some discriminative product $k = g * h$.


Proof. As already remarked, if $f$ is completely multiplicative, then it must distribute
over every Dirichlet product. Assume then that $f(1) \ne 0$ and that $f$ distributes
over the discriminative product $k = g * h$. We show first that necessarily $f(1) = 1$.
Since $k = g * h$ is discriminative, it follows that $k(1) \ne 0$ for otherwise (1) would hold
with $n = 1$. But

$$f(1)k(1) = (fk)(1) = (fg * fh)(1) = f(1)^2 g(1)h(1) = f(1)^2 k(1),$$

and since $f(1)k(1) \ne 0$, it follows that $f(1) = 1$.
We now show by complete induction on $m$ that if $p_1, p_2, \dots, p_m$ is any choice of
primes (distinct or not), then


(2)  $f(p_1 \cdots p_m) = f(p_1) \cdots f(p_m)$.


If $m = 1$, the proposition is trivial, so assume that $m \ge 2$, and write $n = p_1 \cdots p_m$.
Using the distributive property and the inductive assumption, we see that

$$[f(p_1 \cdots p_m) - f(p_1) \cdots f(p_m)] \sum{}' g(d)h(n/d) = 0,$$

since the terms involving $f(1)$ cancel; the prime on the sum indicates that it is to be
taken over all divisors $d$ of $n$ other than $1$ and $n$. But

$$\sum{}' g(d)h(n/d) = k(n) - g(1)h(n) - g(n)h(1),$$

and this is non-zero since $k = g * h$ is discriminative. Therefore (2) holds and the
proof is complete.
A Dirichlet product $k = g * h$ will be called partially discriminative if for every
prime power $p^i$ (with $i \ge 1$), the equation

$$k(p^i) = g(1)h(p^i) + g(p^i)h(1)$$

implies that $i = 1$.


THEOREM 2. Suppose that $f$ is multiplicative. Then $f$ is completely multiplicative
if and only if $f$ distributes over some partially discriminative product $k = g * h$.


Proof. If $f(1) = 0$, then it is easily shown that $f(n) = 0$ for all $n$. If $f(1) \ne 0$, then
$f(1) = 1$ since $f$ is multiplicative and the proof follows by showing that $f(p^m) = f(p)^m$
for every prime power $p^m$, using complete induction on $m$.

Theorem 1 still leaves the case of $f(1) = 0$ unresolved. In general, nothing can be
inferred about $f$ in this case. For example, if we let $f = 1 - |\mu|$, where $\mu$ is the Möbius
function, then $f$ distributes over the discriminative product $\mu * 1 = \delta$, even though $f$
is not even multiplicative. Something, however, can be salvaged: it is not hard to
show that if $f(1) = 0$ and if $f$ distributes over some (not necessarily discriminative)
product $k = g * h$, where $k(n)$ never vanishes, then $f$ must be completely multiplicative;
i.e., $f(n) = 0$ for all $n$.

We show now how the results in [1], [2], [3], and [4] are corollaries of these 
two theorems. 


COROLLARY 1. Sivaramakrishnan's Problem E 2196 [4] asks us to show that
$fI * f^{-1} = f\phi$ if and only if $f$ is completely multiplicative. This is equivalent to $f$
being distributive over $\phi * 1 = I$, and Theorem 1 applies. Sivaramakrishnan's
assumption that $f$ be multiplicative is superfluous.


COROLLARY 2. Carlitz's Problem E 2268 [2] asks us to show that $f$ is completely
multiplicative if and only if it distributes over $1 * 1 = \tau$. Theorem 1 again applies.
This also shows Lambek's Theorem 1 [3].


COROLLARY 3. Apostol's Theorem 2 [1] claims that if $f$ is multiplicative (and
not identically zero), then $f^{-1} = \mu f$ if and only if $f$ is completely multiplicative.
But $f^{-1} = f\mu$ if and only if $f$ distributes over $\mu * 1 = \delta$, and Theorem 2 applies.


COROLLARY 4. Apostol's Theorem 8 generalizes Sivaramakrishnan's result as
follows: Suppose that $G$ is a completely multiplicative function and that $g = G * \mu$.
Suppose further that if $p$ is a prime, then $G(p^i) = 1$ only if $i = 0$. If $f$ is multiplicative,
and if $fG * f^{-1} = fg$, then $f$ is completely multiplicative. (Remark: Apostol assumes
that $G(p) \ne 1$ for every prime $p$, but in his proof he uses the fact that $G(p^i) = G(p)^i \ne 1$
if $i \ge 1$. Since we allow complex values, these are not equivalent assumptions.)


We need not assume even that $G$ is multiplicative, but only that $G(p^i) = 1$ if and
only if $i = 0$. Now $fG * f^{-1} = fg$ is equivalent to $f$ being distributive over $G = (G * \mu) * 1$
and the assumption that $G(p^i) = 1$ if and only if $i = 0$ allows Theorem 2 to apply.


I would like to thank the referee for clarifying the statements of Theorems 1 and 2. 




References 


1. T. M. Apostol, Some properties of completely multiplicative arithmetical functions, this 
MONTHLY, 78 (1971) 266-271. 

2. Leonard Carlitz, Problem E 2268, this MONTHLY, 78 (1971) 1140. 

3. J. Lambek, Arithmetical functions and distributivity, this MONTHLY, 73 (1966) 969-973. 

4. R. Sivaramakrishnan, Problem E 2196, this MONTHLY, 77 (1970) 772.


PERFECT PARALLELOGRAMS 
R. W. SIELAFF, Naperville, Illinois 


M. V. Subbarao [1] showed that the number of triangles $(a,b,c)$, whose integer-valued
sides $a,b,c$ add up to $\lambda$ times their area, is finite for all positive $\lambda$. In fact he
showed that with the exception of the triangle $(2,2,2)$ the number of such triangles
(called Perfect Triangles) is zero for $\lambda > \sqrt{8}$. [Editor's note: In the latter part of
Subbarao's proof, there is a mistake which has caused the triangles $(1, b, b)$ to be
dropped from contention. For each of these, $\lambda > \sqrt{8}$.] He also suggested that it
would be interesting to consider a similar problem for a quadrilateral. This note
will consider a similar problem for a parallelogram.

Let $a, b, c$ be positive integers. If $D$ is the area of a parallelogram
with adjacent sides $b$ and $c$ and included angle $A$, then $D = bc \sin A$.


DEFINITION. A Perfect Parallelogram $(b,c)$ is a parallelogram such that $2b + 2c
= aD$. From the above, $abc \sin A = 2b + 2c$ or

$$\sin A = \frac{2}{a}\left(\frac{1}{b} + \frac{1}{c}\right).$$

Since $\sin A \le 1$, $1/b + 1/c \le a/2$.
The following solutions are possible:


$a = 1,\ b = 3,\ c \ge 6$;   $a = 1,\ b = 4,\ c \ge 4$;
$a = 1,\ b = 5,\ c \ge 4$;   $a = 1,\ b \ge 6,\ c \ge 3$;
$a = 2,\ b \ge 2,\ c \ge 2$;
$a = 3,\ b = 1,\ c \ge 2$;   $a = 3,\ b \ge 2,\ c \ge 1$;
$a \ge 4,\ b \ge 1,\ c \ge 1$.

From the above discussion it is clear that there are infinitely many perfect
parallelograms.

For the special case of the rectangle, $\sin A = 1$ and $c = b + k$, where $k$ is an
integer, $k \ge 0$:

$$a = \frac{2}{b} + \frac{2}{b+k}.$$

For $b > 4$ there are no perfect rectangles since $a < 1$ for any $k$. For $b = 1,2,3,4$,
there are five perfect rectangles:


(1,1), (1,2), (2,2), (3,6), (4,4).

For the special case $\sin A = \tfrac{1}{2}$ and $c = b + k$,

$$a = \frac{4}{b} + \frac{4}{b+k}.$$

For $b > 8$ there are no such perfect parallelograms since $a < 1$ for any $k$.
For $b = 7$ there is no solution. For $b = 1,2,3,4,5,6,8$, there are ten such perfect
parallelograms:


(1,1), (1,2), (1,4), (2,2), (2,4), (3,6), (4,4), (5,20), (6,12), (8,8).
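
Both lists can be reproduced by a short exhaustive search. The following is a minimal sketch, not from the note: the function name `perfect_pairs`, the parameter `s` (standing for $2/\sin A$, so $s = 2$ for rectangles and $s = 4$ for $\sin A = \tfrac{1}{2}$), and the search bounds are illustrative assumptions. It keeps the pairs $b \le c$ for which $a = s/b + s/c$ is an integer.

```python
from fractions import Fraction


def perfect_pairs(s, b_max, c_max):
    """Pairs (b, c) with b <= c for which a = s/b + s/c is an integer.

    s stands for 2 / sin A, so that a = (2 / sin A) * (1/b + 1/c).
    The search is limited to b <= b_max and c <= c_max.
    """
    pairs = []
    for b in range(1, b_max + 1):
        for c in range(b, c_max + 1):
            a = Fraction(s, b) + Fraction(s, c)
            if a.denominator == 1:  # a is a (positive) integer
                pairs.append((b, c))
    return pairs


print(perfect_pairs(2, 10, 200))   # rectangles, sin A = 1
print(perfect_pairs(4, 20, 400))   # sin A = 1/2
```

The bounds are generous rather than sharp: once $b > s$, both $s/b$ and $s/c$ are below $1$, and for $b > 2s$ their sum is below $1$, so no larger $b$ can contribute.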


I would like to acknowledge with thanks the suggestions of the referee. 


Reference 


1. M. V. Subbarao, Perfect Triangles, this MONTHLY, 78 (1971) 384-385. 


A CROWDED SET OF NON-INTERSECTING LINES 
J. A. EIDSWICK, University of Nebraska


THEOREM. There exists a family of lines, uncountably many in each of an 
uncountable number of directions, which has no intersection points on the strip 


$S = \{(x,y) : 0 \le x \le 1,\ -\infty < y < \infty\}$.


Proof. Define h on the Cantor ternary set C by 


h( »» a,3"") = La,9" 
n=1 n=1 
and extend h linearly to all of the interval [0,1]. Obviously, there are uncountably 
many values each of which is attained by h uncountably many times. Also it is 
easy to show that the function h(t) + t is increasing on C and therefore on [0,1]. 
(For additional properties of h and the existence of smoother “‘uncountably recurrent 
functions’’ see [1].) 

Now for each $t \in [0,1]$, let $L(t)$ be the line defined by $y = h(t)x + t$, and let
$F = \{L(t) : 0 \le t \le 1\}$.


Reference 


1. R. B. Darst, $C^\infty$-functions need not be bimeasurable, Proc. Amer. Math. Soc., 27 (1971)
128-132. 


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, 
and Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


A DECEPTION GAME 
JOEL SPENCER, University of California, Los Angeles 


The following game, and problem, are due to Mark Thompson. The problem 
has been worked on by a number of mathematicians with little success. 

The Deception Game is a two person zero-sum game. The first move is a chance
move. An ordered triple of numbers $x_1, x_2, x_3$ is chosen independently from a uniform
distribution on $[0,1]$. The second move is the "deception move". Player II
looks at $x_1, x_2, x_3$ and changes one of the $x$'s to some number (possibly the same).
More formally, he picks $y_1, y_2, y_3$ so that $x_i = y_i$ for at least two of the values $i = 1, 2, 3$.
The third, and final, move is the "guessing move". Player I looks at $y_1, y_2, y_3$ and
picks $i = 1, 2$, or $3$. The payoff to Player I is $x_i$.

Let $w$ be the value of the game (if it exists). Player I may assure himself $\tfrac{1}{2}$ by the
strategy "Choose $i = 1$". (Or any other strategy that does not involve looking at
$y_1, y_2, y_3$.) Thus $w \ge 1/2$.


BIG QUESTION: Does w = 1/2? 


That is, can Player II so bamboozle his opponent that Player I can do no better
than guess without looking at $y_1, y_2, y_3$?

If $w = 1/2$ there is a strategy for Player II that holds Player I to $1/2$. (Or possibly,
for $\varepsilon > 0$, a strategy that holds Player I to $1/2 + \varepsilon$.) A necessary and sufficient condition
on such a strategy is that when Player I sees $y_1, y_2, y_3$ he must have no preference.
That is, $E[x_i \mid y_1, y_2, y_3]$ is independent of $i$.

We note that the computation of the payoff given strategies for Players I and II
often involves subtle probabilistic considerations.

The reader might find it instructive to show that the following strategies for
Player II do not hold Player I to $w = 1/2$. We assume the $x$'s satisfy $x_{i_1} < x_{i_2} < x_{i_3}$.

(1) Change x;, to x;, 

(2) Change x;, to y,, picked from uniform distribution on [0,1] 

(3) Change x,, to y,, picked from uniform distribution on [0,x;,] U [%;,, 1]. 

As this is an infinite game it is not clear, and has not been proved, that a value 
w exists. The usual compactness arguments on the strategy space do not seem to 
work although one certainly feels, intuitively, that w does exist. 




Known results. (All results on this problem are unpublished. The results of Mark 
Thompson are part of his unpublished undergraduate thesis— Harvard University, 
1970.) 

Thompson considered a discrete version when the $x_i$ are chosen uniformly from
$\{0, 1/n, 2/n, \dots, 1\}$ and proved $w = 1/2$ for $1 \le n < 5$.

It seems reasonable to generalize and allow $x_1, x_2, x_3$ to come from any (but
the same) distribution of finite mean. Then the conjecture would be that the value
$w$ is that mean. This writer, for example, showed the conjecture true if the $x_i$ are
chosen uniformly from $\{0,1,8\}$.

Some work has been done on the generalization where $n$ numbers $x_1, \dots, x_n$
are chosen uniformly in $[0,1]$ and $k$ of them are changed by Player II. E. B. Keeler
proved that if $n < 2k$, $w = 1/2$. D. Kleitman and, independently, S. Zamir, proved
that if $n = 4$, $k = 1$ then $w > 1/2$.


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Manuscripts for this Department should be sent to Robert Gilmer, Department of Mathematics, 
Florida State University, Tallahassee, FL 32306. Notes are usually limited to three printed pages. 


TRAFFIC FLOW: LAPLACE TRANSFORMS 
E. A. BENDER AND L. P. NEUWIRTH, Institute for Defense Analyses


Preface. We have found that students seem to develop a better appreciation of
differential equations when presented with substantial applications. The following
project is a modification of a handout we have used in our classes. Each question
is preceded by "Q". Selected answers appear at the end. Section IV is written at
an elementary, heuristic level, because our classes were studying differential equations
as part of the basic calculus sequence. The handout referred to in VI applied
stability theory to the Volterra-Lotka equations for predator-prey models and to
a pendulum with a frictional force which is an arbitrary function of velocity.


1. Introduction. The mathematical study of traffic flow is relatively new. It 
requires little scientific background and uses primarily differential equations, 
probability and statistics. A book [1] is available; however, you may find it difficult 
if you have not had some probability and statistics. There are also brief surveys 
in [2, 3]. 

We shall consider the following problem: 




A single line of cars moves along a straight highway without passing. 
Under what conditions will acceleration and/or deceleration by the first 
driver cause a collision further back? 
It is clear that very rapid action by the first driver can easily cause a pile up. We 
are interested in situations in which moderate action by the first driver causes more
and more violent responses as the effect travels back along the line of cars. 


2. The model. This model is taken from [4]. The position of the lead car is given
by $x_1(t)$, the position of the nth by $x_n(t)$. All drivers will be treated as identical.
(This is not essential. It only simplifies calculations.) Time is measured in units of
driver reaction time. Each driver's acceleration (deceleration) is proportional to the
difference between the speed of his car and that of the car ahead. The first driver
is free to do as he wishes. Thus we assume


(1) $\ddot{x}_n(t) = C(\dot{x}_{n-1}(t-1) - \dot{x}_n(t-1))$


for some C > 0 and all n > 1. The $t - 1$ is due to reaction time lag. For t < 0 we
assume that $\dot{x}_n(t)$ is a constant independent of n; that is, the string of cars is moving
as a unit with constant velocity.
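
A rough numerical feel for model (1) can be had from the sketch below. It integrates the velocity deviations $u_n(t) = \dot{x}_n(t) - \dot{x}_n(0)$ by Euler's method, so that $\dot{u}_n(t) = C(u_{n-1}(t-1) - u_n(t-1))$. The function name `simulate`, the step size, and the leader's motion (a unit jump in speed at t = 0) are illustrative assumptions and are not taken from [4].

```python
def simulate(C, n_cars=3, t_end=40.0, dt=0.01):
    """Euler integration of u_n'(t) = C * (u_{n-1}(t-1) - u_n(t-1)).

    u[n][k] approximates the velocity deviation of car n+1 at time k*dt.
    All deviations are zero for t <= 0; the leader (assumed) jumps to 1 for t > 0.
    """
    delay = round(1.0 / dt)          # one unit of reaction time, in steps
    steps = round(t_end / dt)
    u = [[0.0] * (steps + 1) for _ in range(n_cars)]
    for k in range(1, steps + 1):
        u[0][k] = 1.0                # assumed motion of the lead car
    for k in range(steps):
        j = k - delay                # index of the delayed time t - 1
        for n in range(1, n_cars):
            ahead = u[n - 1][j] if j >= 0 else 0.0
            own = u[n][j] if j >= 0 else 0.0
            u[n][k + 1] = u[n][k] + dt * C * (ahead - own)
    return u
```

With a small value such as C = 0.5 the second car settles to the leader's new speed, while a large value such as C = 2 produces oscillations of growing amplitude; the later sections explain why.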


Q1: Comment on the reasonableness of (1). In particular, might C depend on car
separation? In a qualitative way, how? Might it differ for acceleration and
deceleration? How?

Q2: Introduce $z_n(t) = x_n(t) - x_n(0) - t\dot{x}_n(0)$. Interpret $z_n$ and show that

(2) $\ddot{z}_n(t) = C(\dot{z}_{n-1}(t-1) - \dot{z}_n(t-1))$ for $n > 1$,
    $z_n(t) = \dot{z}_n(t) = 0$ for $t \le 0$ and $n \ge 1$.

(The fact that $z_n(0) = \dot{z}_n(0) = 0$ simplifies the next step.)


3. Some Laplace transforms. 


Q3: Assume that the lead car varies its speed in some fashion when t > 0.
Take Laplace transforms and show that


(3) $Z_{n+1}(s) = C^{n}(C + s e^{s})^{-n} Z_1(s)$,

where $Z_n(s) = \mathcal{L}(z_n(t))$.

Q4: Let $a_n(t) = \ddot{z}_n(t) = \ddot{x}_n(t)$, the acceleration of the nth car. Denote $\mathcal{L}(a_n(t))$
by $A_n(s)$. Using (3) deduce

(4) $A_{n+1}(s) = C^{n}(C + s e^{s})^{-n} A_1(s)$.


(You should recognize that this formula expresses the Laplace transformed 
description of the n + 1 car’s behaviour in terms of the first car’s behaviour!) 
When $A_1(s)$ is specified, these transforms can be inverted (with work), but this
approach leads to something messy and hard to use. Instead, one may rely on a
bit of the theory of complex variables to obtain some simple approximate results.


4. Approximate inversion of Laplace transforms. We want to know roughly how 
the inverse transform of (4) behaves. In this section, we discuss without proof a 
well-known fact in the theory of Laplace transforms. 

The transform $\mathcal{L}(e^{at}) = (s-a)^{-1}$ was derived in class for s a real number
greater than a. However, $(s-a)^{-1}$ makes sense for all complex numbers $s \ne a$.
All Laplace transforms can be extended to complex numbers in this fashion.

Let $g(s) = 1/\mathcal{L}(f)$. Suppose $g(s_0) = 0$ and no solution of $g(s) = 0$ has larger
real part. Also suppose


$g'(s_0) = g''(s_0) = \cdots = g^{(K)}(s_0) = 0$


and $g^{(K+1)}(s_0) \ne 0$. One calls $s_0$ a zero of g of multiplicity K + 1. Let $s_0 = a + bi$.

Fact: Under the above assumptions there is a constant E such that f(t) grows
like $E t^{K} e^{at}$. In addition, f(t) may oscillate. If $b \ne 0$, then there is oscillation like
$\cos(bt + d)$ for some d. Of course, there may be several roots of g(s) = 0 with
real part a and different imaginary parts. Then there will be several components
of the oscillation.


Examples. (The facts given here are without proof; you may wish to verify them.)
Suppose


$\mathcal{L}(f) = \dfrac{1}{((s-a)^2 + b^2)((s-a)^2 + b'^2)}$,


$b, b' \ne 0$. Then $g(s) = ((s-a)^2 + b^2)((s-a)^2 + b'^2)$ has roots $a \pm bi$, $a \pm b'i$.
At each root $g'(s) \ne 0$. Hence f(t) grows like


$c e^{at}\cos(bt + d) + c' e^{at}\cos(b't + d')$


for some constants c, c', d, d'. You can check this by finding f(t). If b' = 0, then
we get


$c e^{at}\cos(bt + d) + \boxed{c' t e^{at}}$


and the term in the box is the important one. If $g(s) = ((s-a)^2 + b^2)(s-a)$, then
f(t) grows like


$c e^{at}\cos(bt + d) + c' e^{at}$.


Q5: Suppose f = e'—e ~cost. Then (complete this equation) 


1) = gq R 


and g(s) = 0 has solutions s = 




In this case, there is some oscillation due to the term e cost, but it dies out 
faster than e”’. The graph of f is like that shown below. This relative unimportance 
of the oscillations is typical of the case in which there is only one root with largest 
real part and this root is a real number. 


5. Analysis of our model using the previous sections. Now suppose the first
driver accelerates and decelerates for a finite period of time; that is, $a_1(t) > 0$ for
a bit, $< 0$ for a bit, and then $a_1(t) = 0$ for $t \ge T$ for some T.


Q6: Use the physical behaviour of the first car to show that $1/A_1(s) = 0$
has no solutions.

The following can be shown concerning the zeros of $C + s e^{s}$ which have largest
real part equal to a.


Condition        Nature of a + bi
$C > \pi/2$      $a > 0$, $b \ne 0$
$C = \pi/2$      $a = 0$, $b \ne 0$


Q7: Using the ideas developed in this handout, analyze each of the four cases
for C. Describe the nature of $a_n(t)$ (n > 1) for large t, where the first
driver behaves as suggested at the beginning of this section. Show that a
collision must take place if $C \ge \pi/2$.
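
The location of a zero of $C + s e^{s}$ can be checked numerically. The sketch below applies Newton's method to $F(s) = C + s e^{s}$ from a starting guess supplied by the user; the function name `refine_root` and the particular guesses are assumptions made for illustration. The method only refines a nearby zero, so it does not by itself prove that the zero found has the largest real part.

```python
import cmath

def refine_root(C, guess, iterations=50):
    """Newton's method for F(s) = C + s*exp(s), started at `guess`."""
    s = complex(guess)
    for _ in range(iterations):
        F = C + s * cmath.exp(s)
        dF = (1 + s) * cmath.exp(s)   # F'(s)
        s -= F / dF
    return s

# At C = pi/2 the point s = i*pi/2 is an exact zero, since
# (i*pi/2) * exp(i*pi/2) = (i*pi/2) * i = -pi/2.
print(refine_root(cmath.pi / 2, 1.5j))
print(refine_root(2.0, 0.2 + 1.7j))    # real part comes out positive
print(refine_root(1.0, -0.3 + 1.3j))   # real part comes out negative
```

The three calls illustrate the boundary case, a case with C above $\pi/2$, and a case with C below $\pi/2$.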


6. Some remarks. The above system is a “feedback mechanism’’ which can be 
characterized as follows: 




(i) a critical point (where $z_n(t) = 0$ for all n),

(ii) a “‘mechanism’’ which acts to restore the system when it deviates from the 
critical point (here the equations (2)), 

(iii) often a time delay (here reaction time). 

Similar ideas were studied in the stability theory handout where we studied 
the nature of critical points for non-linear models by linearization. A non-linear 
time delay situation can also be handled by a linearization approach, but it requires 
Laplace transforms as used here. Related ideas are presented in [6] and the references 
given there. 

Feedback delay is important for stability, as the problem below shows. Suppose
we measure things in units of ordinary time with $\Delta$ the reaction time. We have


(5) $\ddot{x}_{n+1}(t) = k(\dot{x}_n(t-\Delta) - \dot{x}_{n+1}(t-\Delta))$.


Q8: (a) Transform (5) to the form (1) and so express C in terms of k and $\Delta$.
(b) Deduce that as reaction time increases the system tends to become
unstable.
(c) Now assume $\Delta = 0$. Thus replace $t - 1$ by t in (1). Show that


$Z_{n+1}(s) = C^{n}(C + s)^{-n} Z_1(s)$.


Deduce that the system is stable regardless of the value of C > 0. (In fact,
large C gives greater stability in direct opposition to the case $\Delta \ne 0$!)

Another sort of instability can occur in our example. A plot of $z_n(t)$ versus time
may level off as $t \to \infty$, but as n gets larger, the graphs may become wilder. This
can be studied in various ways:

(i) Choose $x_1(t)$ so that (1) can be solved [5].

(ii) Study the inverse transform of (4) by approximate methods which are more 
accurate than those used above [4, 5]. 

(iii) Use other transforms to study other functions besides a,(t), for example 
fora, (t)at [4]. 

We shall consider (i). 

Let $x_n(t) = b_n e^{i\omega t}$ where $b_n$ is to be found. In the end, we can let $x_n(t)$ be the
real part of the above since $\frac{d}{dt}\mathrm{Re}(f(t)) = \mathrm{Re}(f'(t))$. (Re denotes "real part of.")
This gives a sinusoidal motion.

Q9: (a) Let f(t) be an arbitrary differentiable complex valued function of the
real number t. Show that $\frac{d}{dt}\mathrm{Re}\, f(t) = \mathrm{Re}(f'(t))$.

(b) Using (1), show that


$-\omega^2 b_n = iC\omega e^{-i\omega}(b_{n-1} - b_n)$.


(c) Deduce that


$b_{n+1} = b_n \left(1 + \dfrac{i\omega}{C} e^{i\omega}\right)^{-1}$.


(d) Show that we have instability if and only if


$\left|1 + \dfrac{i\omega}{C} e^{i\omega}\right| < 1$


and hence when $C > \omega/(2\sin\omega)$.
(e) Conclude that if $C > 1/2$, there is instability for $\omega$ near zero.
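
The criterion in Q9(d) is easy to explore numerically. The sketch below computes the ratio $|b_{n+1}/b_n|$ from the formula in Q9(c) and the threshold $\omega/(2\sin\omega)$; the function names `amplification` and `critical_C` are hypothetical, chosen only for this illustration.

```python
import cmath
import math

def amplification(C, omega):
    """|b_{n+1} / b_n| = 1 / |1 + (i*omega/C) * exp(i*omega)|."""
    return 1.0 / abs(1 + (1j * omega / C) * cmath.exp(1j * omega))

def critical_C(omega):
    """Threshold omega / (2 sin omega); amplitudes grow down the line when C exceeds it."""
    return omega / (2.0 * math.sin(omega))

print(amplification(0.6, 0.1))   # slightly above 1: each car swings more than the one ahead
print(amplification(0.4, 0.1))   # slightly below 1: the disturbance is damped
print(critical_C(1e-4))          # close to 1/2, the limiting threshold as omega -> 0
```

Squaring the modulus gives $1 + \omega^2/C^2 - 2(\omega/C)\sin\omega$, which is less than 1 exactly when $C > \omega/(2\sin\omega)$ (for $\sin\omega > 0$).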

Note the difference between the result $C > 1/2$ and our earlier result $C > \pi/2 = 1.57\cdots$.

Q10: Account for the difference just noted.

Experiments indicate that C is nearly 1/2 for the actual drivers. See [4]. We have
only considered one type of motion for the leader, but the ideas can be generalized
by using Fourier series or approach (ii). Approach (iii) has another advantage.
We can allow for the fact that (1) should be replaced by a less deterministic equation.
This involves some elementary statistics and the Fourier transform.


SELECTED ANSWERS 
A4: We have
$\mathcal{L}(\ddot{z}_n(t)) = C\{\mathcal{L}(\dot{z}_{n-1}(t-1)) - \mathcal{L}(\dot{z}_n(t-1))\}$.
By tables of $\mathcal{L}$,
$s^2 Z_n(s) = C(e^{-s} s Z_{n-1}(s) - e^{-s} s Z_n(s))$.
Hence $Z_n(s) = C(C + s e^{s})^{-1} Z_{n-1}(s)$.


A6: Let $\alpha = \max_{0 \le t \le T} |a_1(t)|$ and $f(s) = \max_{0 \le t \le T} |e^{-st}|$. Then


$|A_1(s)| \le \int_0^{\infty} |e^{-st} a_1(t)|\, dt = \int_0^{T} |e^{-st} a_1(t)|\, dt \le \int_0^{T} \alpha f(s)\, dt = \alpha f(s) T$.


Hence $(A_1(s))^{-1}$ has no roots.

A7: When $C > \pi/2$ acceleration is unstable: it oscillates with larger and larger
amplitudes as time goes on. When $C = \pi/2$, there is stable oscillatory behavior
for the second car and unstable oscillatory behavior for the rest of the string since


$\dfrac{d}{ds}\left[\dfrac{(C + s e^{s})^{n}}{A_1(s)}\right]_{s = a + bi} = 0$ for $n > 1$.


A8: (a) Let the independent variable be $\tau = t/\Delta$. We obtain
$\ddot{x}_{n+1}(\tau) = \Delta k(\dot{x}_n(\tau - 1) - \dot{x}_{n+1}(\tau - 1))$


so the preceding analysis applies with $C = \Delta k$.

A10: We have considered different accelerations for the first car in the two
cases. We have also given different answers.

(a) In Section 5 we saw that the third car always develops wild acceleration
if $C > \pi/2$.

(b) By Q9(c) we see that no car ever develops wild acceleration, but each
car is wilder than its predecessor if $C > \omega/(2\sin\omega)$.


References 


1. F. Haight, Mathematical Theories of Traffic Flow, Academic Press, New York, 1963. 
2. D. Gazis, Mathematical theory of automobile traffic, Science, 157 (7-21-67) 273-281. 
3. R. Herman and Gardels, Vehicular traffic flow, Scientific American, 209 no. 6 (Dec. 1963) 35-43.
4. R. Herman, Montroll, Potts, and Rothery, Traffic dynamics: Analysis of stability in car
following, Operations Res., 7 (1959) 86-103.
5. R. Chandler, Herman, and Montroll, Traffic Dynamics: Studies in car following, Operations
Res., 6 (1958) 165-184.
6. H. Simon, Models of Man, Chapter 13, Application of Servomechanism Theory to Production
Control, Wiley, New York, 1957.


IRRATIONAL NUMBERS 


J. P. JONES AND S. TOPOROWSKI, University of Calgary


For the past few years a clever proof has been making the rounds of the 
various mathematics departments. 


THEOREM 1. An irrational number raised to an irrational power may be rational.
Proof: Consider the identity
$(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}} = 2$.


If $\sqrt{2}^{\sqrt{2}}$ is rational then we are finished. If not then $\sqrt{2}^{\sqrt{2}}$ is irrational so $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$
is the example.
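
The identity used in the proof can be checked in floating point. The sketch below is only a sanity check of the arithmetic: a numerical computation cannot decide whether $\sqrt{2}^{\sqrt{2}}$ is rational, and the function name `tower` is a hypothetical choice for this illustration.

```python
import math

def tower():
    """Return (sqrt(2)**sqrt(2), (sqrt(2)**sqrt(2))**sqrt(2)) as floats."""
    r = math.sqrt(2.0)
    inner = r ** r          # approximately 1.63
    outer = inner ** r      # equals sqrt(2)**2 = 2 up to rounding
    return inner, outer

inner, outer = tower()
print(inner, outer)
```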

This proof seems first to have been published by Dov Jarden as a curiosity 
in [3]. The proof was published again in [2]. Note that while the proof is elemen- 
tary, it is non-constructive. The non-constructivity enters in the form of the logical 
principle of the excluded middle (tertium non datur) which the intuitionists reject. 

Actually $\sqrt{2}^{\sqrt{2}}$ is irrational, being the square root of Hilbert's number $2^{\sqrt{2}}$,
proved transcendental by Kuzmin [1] in 1930. But this result, which is not elementary,
is not used above. Only the irrationality of $\sqrt{2}$ is used.

Consider next the related theorem. 


THEOREM 2. An irrational number raised to an irrational power may be irrational. 


Of course we can use set theoretical principles to prove that $a^b$ is irrational for
almost all real numbers b. Or we can use the result of Kuzmin [1] to prove Theorem 2.
But does Theorem 2 have an elementary proof?

Proof: Consider the identity $\sqrt{2}^{(\sqrt{2}+1)} = (\sqrt{2}^{\sqrt{2}})\sqrt{2}$.

If $\sqrt{2}^{\sqrt{2}}$ is irrational then we are finished. If not, then $\sqrt{2}^{\sqrt{2}}$ is rational. Hence
$(\sqrt{2}^{\sqrt{2}})\sqrt{2}$ is irrational, and $\sqrt{2}^{(\sqrt{2}+1)}$ is the example in this case.

There is also a simple identity by means of which it can be proved that a rational 
number raised to an irrational power may be irrational. But perhaps the reader 
would enjoy finding this one himself. 


References 


1. R. Kuzmin, On a new class of transcendental numbers, Izv. Akad. Nauk SSSR, Ser. Mat., 
7 (1930) 585-597. 

2. Mathematics Magazine, 39( 1966) 111, 134. 

3. Scripta Mathematica, 19 (1953) 229. 


A SIMPLE PROOF OF THE FORMULA $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$


IOANNIS PAPADIMITRIOU, Athens, Greece 
Start with the inequality $\sin x < x < \tan x$ for $0 < x < \pi/2$, take reciprocals,
and square each member to obtain
$\cot^2 x < 1/x^2 < 1 + \cot^2 x$.


Now put $x = k\pi/(2m+1)$ where k and m are integers, $1 \le k \le m$, and sum on
k to obtain


(1) $\sum_{k=1}^{m} \cot^2\dfrac{k\pi}{2m+1} < \dfrac{(2m+1)^2}{\pi^2}\sum_{k=1}^{m}\dfrac{1}{k^2} < m + \sum_{k=1}^{m}\cot^2\dfrac{k\pi}{2m+1}$.

But since we have

(2) $\sum_{k=1}^{m} \cot^2\dfrac{k\pi}{2m+1} = \dfrac{m(2m-1)}{3}$


(a proof of (2) is given below) relation (1) gives us


$\dfrac{m(2m-1)}{3} < \dfrac{(2m+1)^2}{\pi^2}\sum_{k=1}^{m}\dfrac{1}{k^2} < m + \dfrac{m(2m-1)}{3}$.

Multiply this relation by $\pi^2/(4m^2)$ and let $m \to \infty$ to obtain

$\lim_{m\to\infty}\sum_{k=1}^{m}\dfrac{1}{k^2} = \dfrac{\pi^2}{6}$.

Proof of (2). By equating imaginary parts in the formula


$\cos n\theta + i\sin n\theta = (\cos\theta + i\sin\theta)^n = \sin^n\theta\,(\cot\theta + i)^n = \sin^n\theta \sum_{k=0}^{n}\binom{n}{k} i^k \cot^{n-k}\theta$,


we obtain the trigonometric identity

$\sin n\theta = \sin^n\theta \left\{\binom{n}{1}\cot^{n-1}\theta - \binom{n}{3}\cot^{n-3}\theta + \binom{n}{5}\cot^{n-5}\theta - + \cdots\right\}$.


Take n = 2m + 1 and write this in the form


(3) $\sin(2m+1)\theta = \sin^{2m+1}\theta\, P_m(\cot^2\theta)$ with $0 < \theta < \pi/2$,


where $P_m$ is the polynomial of degree m given by


$P_m(x) = \binom{2m+1}{1}x^m - \binom{2m+1}{3}x^{m-1} + \binom{2m+1}{5}x^{m-2} - + \cdots$.


Since $\sin\theta \ne 0$ for $0 < \theta < \pi/2$, equation (3) shows that $P_m(\cot^2\theta) = 0$ if and only
if $(2m+1)\theta = k\pi$ for some integer k. Therefore $P_m(x)$ vanishes at the m distinct
points $x_k = \cot^2\dfrac{\pi k}{2m+1}$ for $k = 1, 2, \dots, m$. These are all the zeros of $P_m(x)$
and their sum is


$\sum_{k=1}^{m}\cot^2\dfrac{k\pi}{2m+1} = \binom{2m+1}{3}\Big/\binom{2m+1}{1} = \dfrac{m(2m-1)}{3}$,


which proves (2).


Note. This paper was translated from a Greek manuscript and communicated to the MONTHLY 
on behalf of the author by Tom M. Apostol, California Institute of Technology. After this paper 
was written it was learned that the same proof was discovered independently and published in 
Norwegian by Finn Holme in Nordisk Matematisk Tidskrift, vol. 18 (1970), pp. 91-92. See also 
A. M. Yaglom and I. M. Yaglom, Challenging mathematical problems with elementary solutions, 
vol. II, Holden-Day, San Francisco, 1967, problem 145. 


ANOTHER ELEMENTARY PROOF OF EULER'S FORMULA FOR $\zeta(2n)$
TOM M. APOSTOL, California Institute of Technology


1. Introduction. The classic formula


(1) $\zeta(2n) = \sum_{k=1}^{\infty}\dfrac{1}{k^{2n}} = (-1)^{n-1}\dfrac{(2\pi)^{2n}B_{2n}}{2(2n)!}$


which expresses $\zeta(2n)$ as a rational multiple of $\pi^{2n}$ was discovered by Euler [2].
The numbers $B_n$ are Bernoulli numbers and can be defined by the recursion formula


$B_0 = 1$, $\quad B_n = \sum_{s=0}^{n}\binom{n}{s}B_s$ for $n \ge 2$,


or equivalently, as the coefficients in the power series expansion


(2) $\dfrac{z}{e^z - 1} = \sum_{n=0}^{\infty}\dfrac{B_n}{n!}z^n$, $\quad |z| < 2\pi$.


In this notation we have


(3) $B_1 = -\tfrac{1}{2}$, $\quad B_{2n+1} = 0$ for $n \ge 1$,

and

$B_2 = \tfrac{1}{6}$, $\quad B_4 = -\tfrac{1}{30}$, $\quad B_6 = \tfrac{1}{42}$, $\quad B_8 = -\tfrac{1}{30}$, $\quad B_{10} = \tfrac{5}{66}$.
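
The recursion for the Bernoulli numbers and formula (1) can be tried out directly. In the sketch below the recursion $B_n = \sum_{s=0}^{n}\binom{n}{s}B_s$ is solved for the highest unknown, which gives $B_m = -\frac{1}{m+1}\sum_{s<m}\binom{m+1}{s}B_s$; the function names `bernoulli` and `zeta_even` are hypothetical, chosen for this illustration.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli(n_max):
    """Exact Bernoulli numbers B_0..B_{n_max}, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(comb(m + 1, s) * B[s] for s in range(m))
        B.append(-acc / (m + 1))
    return B

def zeta_even(n):
    """Right-hand side of formula (1): (-1)^(n-1) (2 pi)^(2n) B_2n / (2 (2n)!)."""
    B = bernoulli(2 * n)
    return (-1) ** (n - 1) * (2 * pi) ** (2 * n) * float(B[2 * n]) / (2 * factorial(2 * n))

print(bernoulli(10)[2::2])        # B_2, B_4, ..., B_10 as exact fractions
print(zeta_even(1), pi ** 2 / 6)  # the two values should agree
```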


Euler's original proof of (1) was obtained from two distinct representations
of $\pi z\cot\pi z$, a power series expansion obtainable from (2),


$\pi z\cot\pi z = 1 + \sum_{n=1}^{\infty}(-1)^n\dfrac{(2\pi)^{2n}B_{2n}}{(2n)!}z^{2n}$, valid for $|z| < 1$,


and the partial fraction decomposition

$\pi z\cot\pi z = 1 - 2\sum_{k=1}^{\infty}\dfrac{z^2}{k^2 - z^2}$, valid for $z \ne 0, \pm 1, \pm 2, \dots$.


If $|z| < 1$, each term in the last sum can be expanded in a geometric series giving us


$\pi z\cot\pi z = 1 - 2\sum_{k=1}^{\infty}\sum_{n=1}^{\infty}\left(\dfrac{z^2}{k^2}\right)^n = 1 - 2\sum_{n=1}^{\infty}\zeta(2n)z^{2n}$.

Equation (1) follows by equating coefficients of $z^{2n}$ in the two power series expansions
of $\pi z\cot\pi z$. Details justifying this argument are given in Knopp [4], pp. 203-207,
236.

Another well-known proof is obtained by putting s = 2n in Riemann's functional
equation


$\zeta(1 - s) = 2(2\pi)^{-s}\,\Gamma(s)\cos\left(\dfrac{\pi s}{2}\right)\zeta(s)$


and using the fact that $\zeta(1 - 2n) = -B_{2n}/(2n)$. These results are deduced by applying
residue calculus to a contour integral representation of $\zeta(s)$. (See Titchmarsh [8],
pp. 18-20.)

Several writers have given more elementary proofs of (1) that do not require
concepts from advanced real or complex analysis. For example, Titchmarsh [7]
obtained a set of complicated recursion formulas which can be used to evaluate
$\zeta(4), \zeta(6), \dots$, successively in terms of $\zeta(2)$. Estermann [1] obtained a simpler formula
of the same type. These recursion formulas, which show that $\zeta(2n)$ is a rational
multiple of $\zeta(2)^n$, were deduced by rearranging absolutely convergent infinite series
but did not require any function theory. Estermann also gave an elementary proof
of the formula $\zeta(2) = \pi^2/6$ as a consequence of Gregory's series $1 - \tfrac{1}{3} + \tfrac{1}{5} - + \cdots = \pi/4$.




A recursion formula simpler than those of Titchmarsh and Estermann was proved
by G. T. Williams [11] who showed by elementary methods that


(4) $\left(n + \tfrac{1}{2}\right)\zeta(2n) = \sum_{k=1}^{n-1}\zeta(2k)\zeta(2n - 2k)$.

He also obtained the companion result

(5) $\left(n - \tfrac{1}{2}\right)(1 - 2^{-2n})\zeta(2n) = \sum_{k=1}^{n}\xi(2k - 1)\xi(2n - 2k + 1)$,

where


$\xi(s) = \sum_{k=0}^{\infty}\dfrac{(-1)^k}{(2k+1)^s}$ for $s > 0$.

Note that $\xi(1)$ is Gregory's series for $\pi/4$. Taking n = 1 in (5) we find $\tfrac{3}{8}\zeta(2) = \xi^2(1)$
$= \pi^2/16$, so $\zeta(2) = \pi^2/6$. This result, in conjunction with (4), gives a completely
elementary evaluation of $\zeta(2n)$ as a rational multiple of $\pi^{2n}$. Williams also points out
that (4) is equivalent to the following recursion formula for Bernoulli numbers,


$-(2n+1)B_{2n} = \sum_{k=1}^{n-1}\binom{2n}{2k}B_{2k}B_{2n-2k}$.


This relation appears in Nielsen's book [5] and was also discovered independently
by R. S. Underwood [9] who used it to evaluate the sums $\sum_{k=1}^{n}k^{p}$ in terms of
Bernoulli polynomials.
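
Recursion (4) can be iterated numerically starting from $\zeta(2) = \pi^2/6$. The sketch below does this in floating point; the function name `zeta_even_by_recursion` is a hypothetical choice for this illustration, and the known closed forms $\pi^4/90$ and $\pi^6/945$ serve as a check.

```python
from math import pi

def zeta_even_by_recursion(n_max):
    """Return {n: zeta(2n)} for n = 1..n_max using (n + 1/2) zeta(2n) = sum zeta(2k) zeta(2n-2k)."""
    zeta = {1: pi ** 2 / 6}                 # zeta[n] holds zeta(2n)
    for n in range(2, n_max + 1):
        total = sum(zeta[k] * zeta[n - k] for k in range(1, n))
        zeta[n] = total / (n + 0.5)
    return zeta

z = zeta_even_by_recursion(3)
print(z[2], pi ** 4 / 90)
print(z[3], pi ** 6 / 945)
```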

The purpose of this note is to show that the elementary method used by Papadimitriou
to evaluate $\zeta(2)$ in the foregoing paper [6] can be extended to evaluate
$\zeta(2n)$ and leads directly to Equation (1) rather than to a recursion formula. The
interplay of ideas from elementary algebra and trigonometry makes the proof
especially suitable for an elementary calculus course.


2. Elementary Proof of (1). The key ingredient in Papadimitriou's proof is
the formula


$\sum_{k=1}^{m}\cot^2\dfrac{k\pi}{2m+1} = \dfrac{m(2m-1)}{3}$,


or rather the asymptotic relation


(6) $\sum_{k=1}^{m}\cot^2\dfrac{k\pi}{2m+1} \sim \dfrac{2}{3}m^2$,


which it implies. Our evaluation of $\zeta(2n)$ makes use of the following lemma which
provides a generalization of (6).


LEMMA 1. For any integers $m \ge 1$, $n \ge 1$, we have


(7) $\sum_{k=1}^{m}\cot^{2n}\dfrac{k\pi}{2m+1} = (-1)^{n-1}\dfrac{2^{4n-1}B_{2n}}{(2n)!}m^{2n} + O(m^{2n-1})$,


where the constant implied by the O-symbol is independent of m.
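
The leading term in Lemma 1 can be compared with a direct floating-point evaluation of the cotangent sum. In the sketch below the values $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$ are hard-coded from the list in the Introduction, and the function names `cot_power_sum` and `leading_coefficient` are hypothetical, used only for this illustration.

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, B_6, keyed by n (so B2N[n] is B_{2n}).
B2N = {1: Fraction(1, 6), 2: Fraction(-1, 30), 3: Fraction(1, 42)}

def cot_power_sum(m, n):
    """Sum of cot^(2n)(k*pi/(2m+1)) for k = 1..m."""
    return sum(1.0 / math.tan(k * math.pi / (2 * m + 1)) ** (2 * n)
               for k in range(1, m + 1))

def leading_coefficient(n):
    """(-1)^(n-1) * 2^(4n-1) * B_2n / (2n)!, the coefficient of m^(2n) in (7)."""
    return (-1) ** (n - 1) * 2 ** (4 * n - 1) * B2N[n] / math.factorial(2 * n)

m = 2000
for n in (1, 2, 3):
    ratio = cot_power_sum(m, n) / (float(leading_coefficient(n)) * m ** (2 * n))
    print(n, leading_coefficient(n), ratio)   # ratio tends to 1 as m grows
```

The O-term shows up as the small deviation of each ratio from 1, which shrinks roughly like 1/m.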


First we show how the lemma implies (1) and then we prove the lemma.
The inequality $\sin x < x < \tan x$ for $0 < x < \pi/2$ implies


$\cot^{2n}x < \dfrac{1}{x^{2n}} < (1 + \cot^2 x)^n$


for each integer $n \ge 1$. We take $x = k\pi/(2m+1)$ and sum on k to obtain


(8) $\sum_{k=1}^{m}\cot^{2n}\dfrac{k\pi}{2m+1} < \dfrac{(2m+1)^{2n}}{\pi^{2n}}\sum_{k=1}^{m}\dfrac{1}{k^{2n}} < \sum_{k=1}^{m}\left(1 + \cot^2\dfrac{k\pi}{2m+1}\right)^n$.


From (7) and the binomial theorem we see that


$\sum_{k=1}^{m}\left(1 + \cot^2\dfrac{k\pi}{2m+1}\right)^n = \sum_{k=1}^{m}\cot^{2n}\dfrac{k\pi}{2m+1} + O(m^{2n-1})$.


Therefore if we multiply (8) by $\pi^{2n}/(2m)^{2n}$ and let $m \to \infty$ we obtain


$\lim_{m\to\infty}\sum_{k=1}^{m}\dfrac{1}{k^{2n}} = (-1)^{n-1}\dfrac{(2\pi)^{2n}B_{2n}}{2(2n)!}$,


which proves (1).


3. Proof of Lemma 1. As in Papadimitriou's paper we use the polynomial


$P_m(x) = \binom{2m+1}{1}x^m - \binom{2m+1}{3}x^{m-1} + \binom{2m+1}{5}x^{m-2} - + \cdots$


whose m zeros are the numbers


$x_k = \cot^2\dfrac{k\pi}{2m+1}$, $\quad k = 1, 2, \dots, m$.


Let $s_n = x_1^n + \cdots + x_m^n$. This sum appears on the left of (7) and we are to prove that


(9) $s_n = (-1)^{n-1}\dfrac{2^{4n-1}B_{2n}}{(2n)!}m^{2n} + O(m^{2n-1})$.


The proof is by induction on n. The case n = 1 was proved in Papadimitriou's
paper. Now we assume that (9) is true for $n = 1, 2, \dots, r-1$, and prove it for n = r,
with the help of Newton's formulas (see [10], p. 261)


(10) $-s_r = (-1)^r r\sigma_r + \sum_{k=1}^{r-1}(-1)^{r-k}s_k\sigma_{r-k}$, $\quad r = 1, 2, \dots, m$,


where $\sigma_1, \sigma_2, \dots, \sigma_m$ are the elementary symmetric functions of the zeros $x_1, \dots, x_m$.
In this case we have


(11) $\sigma_r = \binom{2m+1}{2r+1}\Big/\binom{2m+1}{1} = \dfrac{2m(2m-1)\cdots(2m-2r+1)}{(2r+1)!} = \dfrac{2^{2r}}{(2r+1)!}m^{2r} + O(m^{2r-1})$,

for $r = 1, 2, \dots, m$. Using this with (9) we find


$(-1)^{r-k}s_k\sigma_{r-k} = (-1)^{r-1}\dfrac{2^{2r+2k-1}B_{2k}}{(2k)!\,(2r+1-2k)!}m^{2r} + O(m^{2r-1})$


so (10) becomes


$-s_r = \dfrac{2r(-1)^r 2^{2r-1}}{(2r+1)!}m^{2r} + (-1)^{r-1}2^{2r-1}m^{2r}\sum_{k=1}^{r-1}\dfrac{2^{2k}B_{2k}}{(2k)!\,(2r+1-2k)!} + O(m^{2r-1})$

$\phantom{-s_r} = (-1)^r 2^{2r-1}m^{2r}\left\{\dfrac{1}{(2r)!} - \sum_{k=0}^{r-1}\dfrac{2^{2k}B_{2k}}{(2k)!\,(2r+1-2k)!}\right\} + O(m^{2r-1})$.


Now we use Lemma 2 (stated below) to evaluate the expression in braces and we find


$-s_r = (-1)^r\dfrac{2^{4r-1}B_{2r}}{(2r)!}m^{2r} + O(m^{2r-1})$,


which proves (9) by induction.
4. A lemma on Bernoulli numbers.
LEMMA 2. If $r \ge 1$ we have


(12) $\sum_{k=0}^{r}\dfrac{2^{2k}B_{2k}}{(2k)!\,(2r+1-2k)!} = \dfrac{1}{(2r)!}$.


Proof. Let $B_n(x)$ denote the Bernoulli polynomial defined by


(13) $B_n(x) = \sum_{s=0}^{n}\binom{n}{s}B_s x^{n-s}$

or, equivalently, by the power series expansion

$\dfrac{z e^{xz}}{e^z - 1} = \sum_{n=0}^{\infty}\dfrac{B_n(x)}{n!}z^n$, $\quad |z| < 2\pi$.


A well-known property of $B_n(x)$ is the functional equation
(14) $B_n(1 - x) = (-1)^n B_n(x)$


which follows at once from the identity


$\dfrac{z e^{(1-x)z}}{e^z - 1} = \dfrac{-z e^{-xz}}{e^{-z} - 1}$.


Equation (14) implies


(15) $B_{2r+1}\left(\tfrac{1}{2}\right) = 0$.


Formula (12) is a disguised form of (15). We use (15) along with (13) and multiply
by $2^{2r+1}$ to obtain


$\sum_{s=0}^{2r+1}\binom{2r+1}{s}2^s B_s = 0$.


In view of (3) this becomes


$\binom{2r+1}{1}2B_1 + \sum_{k=0}^{r}\binom{2r+1}{2k}2^{2k}B_{2k} = 0$,


which is the same as (12).
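
Lemma 2 lends itself to an exact check with rational arithmetic. The sketch below assumes the lemma in the form $\sum_{k=0}^{r} 2^{2k}B_{2k}/((2k)!\,(2r+1-2k)!) = 1/(2r)!$ and uses the convention $B_1 = -1/2$; the function names `bernoulli` and `lemma2_lhs` are hypothetical, chosen for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(n_max):
    """Exact Bernoulli numbers B_0..B_{n_max} (B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(comb(m + 1, s) * B[s] for s in range(m))
        B.append(-acc / (m + 1))
    return B

def lemma2_lhs(r):
    """Sum over k = 0..r of 2^(2k) B_2k / ((2k)! (2r+1-2k)!)."""
    B = bernoulli(2 * r)
    return sum(Fraction(2 ** (2 * k)) * B[2 * k]
               / (factorial(2 * k) * factorial(2 * r + 1 - 2 * k))
               for k in range(r + 1))

for r in range(1, 6):
    print(r, lemma2_lhs(r), Fraction(1, factorial(2 * r)))
```

For r = 1 the sum is 1/6 + 1/3 = 1/2, and for r = 2 it is 1/120 + 1/18 - 1/45 = 1/24, in agreement with 1/(2r)!.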


5. Concluding remarks. When the method of section 2 is applied to evaluate
$\zeta(2n+1)$ we obtain the formula


(16) $\zeta(2n+1) = \left(\dfrac{\pi}{2}\right)^{2n+1}\lim_{m\to\infty}\dfrac{1}{m^{2n+1}}\sum_{k=1}^{m}\cot^{2n+1}\dfrac{k\pi}{2m+1}$,


or its equivalent,


(17) $\sum_{k=1}^{m}\cot^{2n+1}\dfrac{k\pi}{2m+1} = \left(\dfrac{2}{\pi}\right)^{2n+1}\zeta(2n+1)\,m^{2n+1} + o(m^{2n+1})$ as $m \to \infty$.


Although (16) expresses $\zeta(2n+1)$ as a multiple of $\pi^{2n+1}$ it is not known if this
multiple is rational or not. The author has been unable to extend the proof of
Lemma 1 to obtain an alternate formula for the asymptotic value (for large m) of
the sum in (17). All attempts to estimate this sum lead back to (17).


Note. After this paper was submitted for publication, a paper appeared by Kenneth S. Williams
[12] on the same subject. Williams also uses the cotangent sum of Lemma 1 in his evaluation of $\zeta(2n)$,
but his proof, like Euler's, uses complex function theory and cannot be considered elementary. See
also I. Skau and E. S. Selmer, Nordisk Mat. Tidskr., 19 (1971) 120-124.


References 


1. T. Estermann, Elementary evaluation of $\zeta(2k)$, J. London Math. Soc., 22 (1947) 10-13.

2. L. Euler, De summis serierum reciprocarum, Comment. Acad. Sci. Petropolit., 7 (1734/35),
(1740) 123-134; Opera omnia, Ser. 1 Bd. 14, 73-86. Leipzig-Berlin, 1924.

3. Finn Holme, En enkel beregning av $\sum_{k=1}^{\infty} 1/k^2$, Nordisk Mat. Tidskr., 18 (1970) 91-92.

4. K. Knopp, Theory and Application of Infinite Series, Hafner, New York, 1951.

5. N. Nielsen, Traité élémentaire des nombres de Bernoulli, Gauthier-Villars, Paris, 1923.

6. Ioannis Papadimitriou, A simple proof of the formula $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$, this MONTHLY,
80 (1973) preceding article.

7. E. C. Titchmarsh, A series inversion formula, Proc. London Math. Soc., (2) 26 (1926) 1-11.

8. E. C. Titchmarsh, The Theory of the Riemann Zeta Function, Oxford, 1951.

9. R. S. Underwood, An expression for the summation $\sum_{m=1}^{n} m^p$, this MONTHLY, 35 (1928)
424-428.

10. J. V. Uspensky, Theory of Equations, McGraw-Hill, New York, 1948.

11. G. T. Williams, A new method of evaluating $\zeta(2n)$, this MONTHLY, 60 (1953) 19-25.

12. Kenneth S. Williams, On $\sum_{n=1}^{\infty}(1/n^{2k})$, Math. Mag., 44 (1971) 273-276.


MATHEMATICAL EDUCATION 
EDITED BY J. G. HARVEY AND M. W. POWNALL 


Material for this Department should be sent to either of the editors: J. G. Harvey, Department 
of Mathematics, University of Wisconsin, WI 53706; M. W. Pownall, Department of Mathe- 
matics, Colgate University, Hamilton, NY 13346, 


AN INTEGRATED SEQUENCE IN THE MATHEMATICAL SCIENCES 
FOR UNDERGRADUATE BUSINESS STUDENTS 


R. H. RANDLES AND A. J. SCHAEFFER, University of Iowa 


The courses in mathematical science (mathematics, statistics, and computer 
programming) which are required for every business student vary widely among 
colleges and universities. In a recent sample survey of midwestern universities, 
Rodger Collons [1] found that among the 30 schools surveyed on the semester 
system, the required hours fell between the extremes of 0 and 21. The median of the 
required hours among those 30 schools was 9. A typical program might therefore
consist of one 3 hour course each in mathematics, statistics, and computer programming.
It is the purpose of this article to describe a sequence of two 4 semester hour
courses developed at the University of Iowa in which topics from the three areas of 
mathematics, statistics and computer programming are blended together in an effort 
to increase the motivation of each of these subject areas. It is hoped that in so doing, 
the student will acquire more of an overview of the mathematical sciences and how 
techniques from all three disciplines lend themselves (possibly in conjunction with 
one another) to the solution of business problems. This article contains some of 


the details of this sequence and some suggestions for integrating topics from the 
mathematical sciences. 


1. Course structure. The number of students entering this sequence each year is 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04478. To facilitate their consideration, solutions of Elementary
Problems in this issue should be typed (with double spacing) and should be mailed before
July 31, 1973. Contributors (in the United States) who desire acknowledgment of receipt of
their solutions are asked to enclose self-addressed stamped postcards.


E 2408. Proposed by Bernardo Recamán, Colegio San Carlos, Bogotá,
Colombia 


A natural number is a decimal Colombian number if it cannot be written as 
m + s(m) for any natural number m, where s(m) denotes the sum of the digits of m 
when m is expressed in decimal notation. For example, 28 is not a decimal Colombian 
number since 28 = 23 + 2+ 3, whereas 9 is a decimal Colombian number. Base n 
Colombian numbers are defined analogously. 

Prove that in any base there are infinitely many Colombian numbers. 


E 2409. Proposed by A. V. Boyd, University of the Witwatersrand, Johannes- 
burg, South Africa 


Sum the series 
e-8) on -1 
4x)". 
2 ( n ( *) 
E 2410. Proposed by Barry Wolk, University of Manitoba 


Evaluate 
> (— 1)’ n+r 
y-90 (nh tr)(2r+1)\ 2r } 
E 2411. Proposed by F. W. Barnes, University of Michigan 


Let G be a group. Give sufficient conditions on a and b so that $(xy)^a = x^a y^a$ and
$(xy)^b = x^b y^b$ for all $x, y \in G$, force G to be commutative. The conditions must be
general enough to imply the result for a = 8 and b = 11.


E 2412. Proposed by Michael Goldberg, Washington, D.C. 


If a line segment AB of unit length is rotated 180° about the fixed end B to the 
position BA', then the end A makes a track of length $\pi$. However, if the end B is
allowed to move also, then the sum of the lengths of the tracks made by A and B 
can be shorter. Note that if a portion of the track is retraced, then the motion is 
increased, but the length of the track is not increased. What is the shortest total 
length of track needed to carry the line from AB to BA'? 




E 2413. Proposed by C. B. Grosch, Control Data Corp., Minneapolis 


Consider an oblate spheroid and the circle in its equatorial plane which is the 
locus of the foci of meridian ellipses. Show that any ray that originates on the circle 
will be reflected to the circle after a reflection from the interior of the spheroid. (On 
the spheroid, the angle of incidence equals the angle of reflection.) 


SOLUTIONS OF ELEMENTARY PROBLEMS 


Fibonacci’s Rabbits Run Again 
E 2350 [1972, 393]. Proposed by H. D. Ruderman, Hunter College High School 


A total of n fair coins are flipped and laid in a row. What is the probability that 
in the row neither the combination HTH nor the combination THT occurs anywhere? 


I. Solution by N. J. Fine, Pennsylvania State University. Let $A_n, B_n, C_n, D_n$ be
the numbers of successful n-sequences ending with HH, HT, TH, TT respectively.
Thus $A_2 = B_2 = C_2 = D_2 = 1$. The required probability $p_n = 2^{-n}(A_n + B_n + C_n + D_n)$.
By adjoining an H to each sequence of type $A_n$ or $C_n$, we get one of type $A_{n+1}$,
and every successful sequence of type $A_{n+1}$ is obtained in this way. Hence


$A_{n+1} = A_n + C_n$.


Similarly, $B_{n+1} = A_n$, $C_{n+1} = D_n$, $D_{n+1} = D_n + B_n$. By symmetry, $D_n = A_n$ and $C_n = B_n$.
Hence

$A_{n+1} = A_n + B_n = A_n + A_{n-1}$ $\quad (n \ge 3)$.


Let $f_n$ be the nth Fibonacci number. Then $A_2 = 1 = f_2$, $A_3 = 2 = f_3$. Hence
$A_n = f_n$ for all $n \ge 2$, and


$p_n = 2^{-n}(2A_n + 2B_n) = 2^{-n+1}(A_n + A_{n-1}) = 2^{-n+1}A_{n+1}$,

so finally


$p_n = \dfrac{f_{n+1}}{2^{n-1}}$.
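
The closed form $p_n = f_{n+1}/2^{n-1}$ (with $f_1 = f_2 = 1$) can be confirmed by brute-force enumeration for small n. The sketch below counts the rows of n coins that avoid both HTH and THT; the function names `count_good` and `fib` are hypothetical, chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

def count_good(n):
    """Number of H/T rows of length n containing neither HTH nor THT."""
    count = 0
    for seq in product("HT", repeat=n):
        row = "".join(seq)
        if "HTH" not in row and "THT" not in row:
            count += 1
    return count

def fib(n):
    """n-th Fibonacci number with f_1 = f_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

for n in range(1, 11):
    print(n, Fraction(count_good(n), 2 ** n), Fraction(fib(n + 1), 2 ** (n - 1)))
```

The two fractions printed on each line should coincide; for n = 5 both equal 1/2.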


II. Solution by R. J. Dickson, Lockheed Palo Alto Research Laboratory.
Mark an S or D between each pair of adjacent coins according as they show the same
or different faces. A sequence of n - 1 Bernoulli trials with probability 1/2 results, and
the probability sought is the same as the probability that this sequence does not
contain two successive D's. This is a special case of the "problem of runs" treated by
de Moivre in his "Doctrine of Chances"; cf. Uspensky, Introduction to Mathematical
Probability, McGraw-Hill (1937), 77-84.


Also solved by the proposer and fifty-three others. 
Other references for the S-D sequence of solution II include problem A-5 of the 1956 Putnam
Exam, and problem E 2022 in this MONTHLY [1968, 1117] (located by D. M. Bloom), problem B-236
in the Fibonacci Quarterly, 10 (1972) 330 (by Graham Lord), and Ivan Niven, Mathematics of Choice,
Random House (1965), 50-52 (by D. P. Sumner). J. V. Michalowicz presented a computer-generated
list of the probabilities $p_n$ which included $p_5 = 0.500$, $p_{13} = 0.092$, $p_{24} = 0.009$, $p_{35} = 0.0009$.


A Number-theoretic Inequality 


E 2351 [1972, 394]. Proposed by Štefan Porubský, Comenius University,
Bratislava, Czechoslovakia
Let $\varphi$ denote Euler's totient function and let $\tau(n)$ denote the number of divisors
of n. Show that
$\varphi(n)[\tau(n)]^2 \le n^2$
for all positive integers $n \ne 4$. For what n does equality hold?


Solution by M. G. Greening, University of New South Wales, Australia. Set
$\varphi(n)[\tau(n)]^2/n^2 = f(n)$.

(i) Clearly f(1) = 1, f(2) = 1, f(4) = 9/8 and f(3) = 8/9.

(ii) $f(p) = 4(p-1)/p^2 < 1$ for p an odd prime.

Also $f(p^{\alpha+1})/f(p^{\alpha}) = p(\alpha+2)^2/((\alpha+1)^2 p^2) \le 9/(4p)$ for $\alpha > 0$, whence induction
shows that $f(p^{\alpha}) < 1$ for all positive $\alpha$.

(iii) $f(2^{\alpha}) = (\alpha+1)^2/2^{\alpha+1} < 1$ for $\alpha > 3$. f(8) = 1.

(iv) From (i) and (iii), f(4k) = 1 demands k = 2 or (k, 2) = 1.
In the latter case, f(k) = 1/f(4) = 8/9, so that $3 \mid k$. As $f(3^{\alpha}) < 8/9$ for $\alpha > 1$, by (ii),
and $f(p^{\alpha}) = 1$ demands p = 2, while f is multiplicative, the only solution of f(k) = 8/9
is k = 3.

Hence, equality holds only for n = 1, 2, 8, 12.
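
The conclusion can be spot-checked by direct computation over a finite range. The sketch below uses naive definitions of Euler's totient and the divisor-counting function and compares $\varphi(n)[\tau(n)]^2$ with $n^2$ in exact integer arithmetic; the function names and the search limit are hypothetical choices for this illustration, and a finite search is of course no substitute for the proof.

```python
from math import gcd

def phi(n):
    """Euler's totient: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def tau(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def equality_cases(limit):
    """All n <= limit with phi(n) * tau(n)^2 == n^2."""
    return [n for n in range(1, limit + 1) if phi(n) * tau(n) ** 2 == n * n]

def violations(limit):
    """All n <= limit with phi(n) * tau(n)^2 > n^2."""
    return [n for n in range(1, limit + 1) if phi(n) * tau(n) ** 2 > n * n]

print(equality_cases(300))   # expected to list 1, 2, 8, 12
print(violations(300))       # expected to list only 4
```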


Also solved by the proposer and forty-one others. 


A Monotone Decreasing Sequence 
E 2352 [1972, 394]. Proposed by Marlow Sholander, Case Western Reserve 
University 
For each positive integer n, define

$Q_n = \left(1 + \dfrac{1}{n}\right)^{n^2}\dfrac{n!}{n^n\sqrt{n}}$.


Show that the sequence $\{Q_n\}$ is monotonely decreasing and find its limit.


Solution by St. Olaf College Students. It is easily calculated that


$\dfrac{Q_{n+1}}{Q_n} = \dfrac{\left(1 + \frac{1}{n+1}\right)^{(n+1)^2}}{\left(1 + \frac{1}{n}\right)^{n^2+n+1/2}} = \left(1 - \dfrac{1}{(n+1)^2}\right)^{(n+1)^2}\left(1 + \dfrac{1}{n}\right)^{n+1/2} = e^{S_1 + S_2}$,

where

$S_1 = -\sum_{k=2}^{\infty}\dfrac{1}{k(n+1)^{2k-2}}$, $\quad S_2 = \sum_{k=2}^{\infty}(-1)^k\left(\dfrac{1}{k+1} - \dfrac{1}{2k}\right)\dfrac{1}{n^k}$


are familiar convergent series for all positive integers n.

Note that $S_1$ is a series of negative terms and $S_2$ is an alternating series whose
terms in absolute value decrease monotonely. Consequently the sums of both series
are less than their respective first terms. Thus it follows that


$S_1 + S_2 < -\dfrac{1}{2(n+1)^2} + \dfrac{1}{12n^2} < 0$, $\quad n = 1, 2, \dots$


This proves that $Q_{n+1}/Q_n < 1$; that is, $\{Q_n\}$ is monotonely decreasing.
Since Q,> 0, it follows that the sequence {Q,} possesses a limit. The limit is 
easily calculated if n! is replaced by the familiar (n/e)” ./2nn. Thus 


1\" _ 
lim,...0, = lim, (1 + a} em" In 


— \/2n my + €XP( In(1 + 7 — n} 


An ; 1 1 1 
= /2n exp(lim, ..( ~3T3,7-G,t ~)] 
= /2n/e. 
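
A floating-point sketch of the same two facts follows. It assumes the definition $Q_n = (1+1/n)^{n^2}\, n!/(n^n\sqrt{n})$ and works with logarithms through `math.lgamma` to avoid overflow; the function names are illustrative, not taken from the solution.

```python
import math


def log_q(n):
    """ln Q_n = n^2 ln(1 + 1/n) + ln n! - n ln n - (1/2) ln n."""
    return (n * n * math.log1p(1.0 / n)
            + math.lgamma(n + 1)
            - n * math.log(n)
            - 0.5 * math.log(n))


def q(n):
    """Q_n evaluated in double precision."""
    return math.exp(log_q(n))


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 10**6):
        print(n, q(n))
    print("sqrt(2*pi/e) =", math.sqrt(2 * math.pi / math.e))
```

The printed values decrease from $Q_1 = 2$ toward $\sqrt{2\pi/e}$, in line with the argument above; this is a numerical illustration only, not a proof.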


Also solved by M. T. Bird, Peter Bundschuh (Germany), Frederick Carty, R. J. Dickson, Mich- 
ael Goldberg, Richard Groenweld, Emil Grosswald, Sidney Heller, G. A. Heuer, Hans Kappus 
(Switzerland), Barbara A. Keller, Charlotte Krauthamer (Austria), O. P. Lossers (Netherlands), 
B. E. Rhoades, G. S. Rogers, T. Salat (Czechoslovakia), Leroy Sathre, F. G. Schmitt, Jr., W. C. 
Sisarcick, F. C. Smith, T. A. R. Stettler (Switzerland), L. M. Young, and the proposer. 


Optimal Sequence of Products 


E 2353 [1972, 394]. Proposed by J. G. Rau, Litton Systems, Culver City, 
California 


Given two sequences $\{a_1, a_2, \cdots, a_n\}$ and $\{b_1, b_2, \cdots, b_n\}$ of positive real numbers,
find the permutation $(j_1, \cdots, j_n)$ of the integers $1, 2, \cdots, n$ for which
$$\sum_{m=1}^{n} \sum_{k=1}^{m} b_{j_m} a_{j_k}$$
is a minimum.


I. Solution by W. O. J. Moser, University College, London, England. Let S
denote the sum when

$$(j_1, j_2, \cdots, j_n) = (1, 2, \cdots, n);$$

let S' denote the sum when

$$(j_1, j_2, \cdots, j_n) = (1, 2, \cdots, k-1, k+1, k, k+2, k+3, \cdots, n).$$

Then

$$S' = S + b_k a_{k+1} - b_{k+1} a_k = S + b_k b_{k+1} \left(\frac{a_{k+1}}{b_{k+1}} - \frac{a_k}{b_k}\right),$$

so S' < S if and only if $a_{k+1}/b_{k+1} < a_k/b_k$.
Letting $r_i = a_i/b_i$, $i = 1, \cdots, n$, we see that the sum is minimum when $(j_1, j_2, \cdots, j_n)$
is such that

$$r_{j_1} \le r_{j_2} \le \cdots \le r_{j_n}.$$
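
The ordering rule can be compared against exhaustive search on a small instance. This is a sketch under the reading that the sum to be minimized is $\sum_{m}\sum_{k\le m} b_{j_m} a_{j_k}$; the function names and the sample data are illustrative assumptions.

```python
from itertools import permutations


def weighted_sum(a, b, order):
    """Sum over m of b[j_m] times (a[j_1] + ... + a[j_m]) for the given order."""
    total = 0
    prefix = 0
    for j in order:
        prefix += a[j]
        total += b[j] * prefix
    return total


def ratio_order(a, b):
    """Indices sorted so that the ratios a[i] / b[i] are non-decreasing."""
    return sorted(range(len(a)), key=lambda i: a[i] / b[i])


def brute_force_minimum(a, b):
    """Minimum of the sum over all permutations (feasible only for small n)."""
    return min(weighted_sum(a, b, p) for p in permutations(range(len(a))))


if __name__ == "__main__":
    a = [3, 1, 4, 1, 5]
    b = [2, 7, 1, 8, 2]
    order = ratio_order(a, b)
    print(order, weighted_sum(a, b, order), brute_force_minimum(a, b))
```

Sorting costs O(n log n), whereas the exhaustive search over n! permutations is included only as a check.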


II. Solution by R. J. Dickson, Lockheed Palo Alto Research Laboratory.
Set $C_{km} = a_{j_k} b_{j_m}$, $C = \|C_{km}\|$, and let C = L + U + D denote the decomposition of C
into lower, upper, and diagonal parts. Let s(A) denote the (linear, homogeneous)
functional equal to the sum of the elements of a matrix A. The sum to be minimized
by choice of permutation is then s(D + U). Since s(C + D) is independent of the
permutation, the identity $D + U = \tfrac12 (C + D) + \tfrac12 (U - L)$ shows that an equivalent
problem is to minimize $\tfrac12 s(U - L) = \sum_{k<m} \tfrac12 (C_{km} - C_{mk})$. Except for signs, the
$\binom{n}{2}$ terms of this sum are the areas of all triangles determined in the first quadrant
by the origin and any two points selected from the set with coordinates $(a_i, b_i)$,
$i = 1, 2, \cdots, n$. The choice of permutation therefore affects only the distribution of
signs, and the minimum of the sum is attained on any permutation which attaches
negative signs to all the non-zero terms. The permutation sought is therefore one for
which $b_{j_k}/a_{j_k}$, $k = 1, 2, \cdots, n$, is a non-increasing sequence, and is unique only if
there is no pair of the points $\{(a_i, b_i)\}$ collinear with the origin.


III. Comment by L. P. Prostanstus, Naval Electronics Laboratory Center, 
San Diego, California. The solution to this problem has been published in J. G. 
Rau, Minimizing a function of permutations of n integers, Operations Research, 19 
(1971), 237-240. (This paper describes some applications of this problem, such as 
minimizing expected cost of testing certain multi-component devices to find cause 
of failure. -Ed.)


Also solved by the proposer (see Comment), Peter de Buda, Sidney Heller, J. R. Kuttler, O. P. 
Lossers (Netherlands), L. P. Prostanstus (see Comment), F. G. Schmitt, Jr., and Nan-Shan Shou. 


Derangements 


E 2354 [1972, 394]. Proposed by L. Carlitz and R. A. Scoville, Duke University 


Let $S = \{1, 2, \cdots, n\}$ and let $D_n$ denote the number of permutations of S with no
fixed points (derangements). Let $E_n$ denote the number of even permutations of S
with no fixed points. Show that

$$E_n = \binom{n}{2} D_{n-2} - (-1)^n (n-1), \qquad n = 2, 3, \cdots.$$


Solution by Bob Prielipp and N. J. Kuenzi, University of Wisconsin, Oshkosh. 


The following well-known formulas are given in the solution of Problem E 907:

$$(1)\quad D_n = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!}, \qquad (2)\quad E_n = \{D_n - (-1)^n (n-1)\}/2.$$

(See pp. 687-688, December 1950 issue of this MONTHLY.)
Using (1) we have

$$(3)\quad \binom{n}{2} D_{n-2} = n! \left[\sum_{i=0}^{n} \frac{(-1)^i}{i!} - \frac{(-1)^{n-1}}{(n-1)!} - \frac{(-1)^n}{n!}\right] \Big/ 2 = \{D_n + (-1)^n (n-1)\}/2.$$

Using this last equation and (2) we have the desired result:

$$\binom{n}{2} D_{n-2} - (-1)^n (n-1) = \{D_n - (-1)^n (n-1)\}/2 = E_n.$$
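
For small n the identity can be confirmed by direct enumeration. The sketch below is illustrative only; `is_even` and `count_derangements` are hypothetical helper names, and parity is read off from the cycle count.

```python
from itertools import permutations
from math import comb


def is_even(perm):
    """A permutation of n points with c cycles has sign (-1)**(n - c)."""
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return (n - cycles) % 2 == 0


def count_derangements(n):
    """Return (D_n, E_n): all derangements and the even ones among them."""
    d = e = 0
    for perm in permutations(range(n)):
        if all(perm[i] != i for i in range(n)):
            d += 1
            if is_even(perm):
                e += 1
    return d, e


if __name__ == "__main__":
    for n in range(2, 8):
        d_n, e_n = count_derangements(n)
        d_prev, _ = count_derangements(n - 2)
        print(n, d_n, e_n, comb(n, 2) * d_prev - (-1) ** n * (n - 1))
```

The enumeration is factorial in n, so it serves only as a check for the first few cases.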


Also solved by Marc Berger, Problem Solving Group Berne (Switzerland), D. M. Bloom, Peter 
de Buda, J. Chone (France), M. G. Greening (Australia), Wells Johnson, Harry Lass, O. P. Lossers 
(Netherlands), L. E. Mattics, W. O. J. Moser, P. J. Murray, W. J. Sanchez, F. G. Schmitt, Jr., Allen 
Stenger, E. T. H. Wang, and the proposers. 


An Early Jacobi 


E 2357 [1972, 518]. Proposed by M. D. Hirschhorn, Penicuik, Midlothian,
Scotland


Suppose that m and n are nonnegative integers and that $x_0, x_1, \cdots, x_m$ are distinct.
Show that

$$\sum x_0^{k_0} x_1^{k_1} \cdots x_m^{k_m} = \sum_{i=0}^{m} \frac{x_i^{m+n}}{\prod (x_i - x_j)},$$

where the sum on the left-hand side is over all $(k_0, k_1, \cdots, k_m)$ with $k_i \ge 0$ and
$k_0 + \cdots + k_m = n$, and where the product is over all $j \ne i$.


Solution by M. G. Greening, University of New South Wales, Australia. Set
$b_i = x_i^{m+n} / \prod_{j \ne i} (x_i - x_j)$. The left-hand side of the given identity is the coefficient of
$x^{-1}$ in the Laurent expansion of $f(x) = x^{m+n} / \prod_{i=0}^{m} (x - x_i)$, where

$$|x| > \max \{|x_0|, |x_1|, \cdots, |x_m|\},$$

as $f(x) = x^{n-1} \prod_{i=0}^{m} (1 - x_i/x)^{-1}$.

On the other hand, decomposition of f(x) into partial fractions by the "cover-up"
method gives, apart from a polynomial in x (present when $n \ge 1$) which contributes
nothing to the coefficient of $x^{-1}$, the sum

$$\sum_{i=0}^{m} \frac{b_i}{x - x_i} = \sum_{i=0}^{m} b_i x^{-1} (1 - x_i/x)^{-1},$$

and the coefficient of $x^{-1}$ in this expansion is $\sum_{i=0}^{m} b_i$, which is the right-hand side
of the identity.
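
The identity lends itself to an exact check with rational arithmetic. In the sketch below the left-hand side is evaluated as the complete homogeneous symmetric polynomial of degree n; the names `lhs` and `rhs` and the sample points are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod


def lhs(xs, n):
    """Sum of all monomials of total degree n in the entries of xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, n))


def rhs(xs, n):
    """Sum over i of x_i**(m+n) divided by the product of (x_i - x_j), j != i."""
    m = len(xs) - 1
    total = Fraction(0)
    for i, xi in enumerate(xs):
        denom = prod(xi - xj for j, xj in enumerate(xs) if j != i)
        total += Fraction(xi ** (m + n), denom)
    return total


if __name__ == "__main__":
    xs = [2, -3, 5, 7]  # distinct integers, chosen arbitrarily
    for n in range(6):
        print(n, lhs(xs, n), rhs(xs, n))
```

Each multiset of size n drawn from the $x_i$ corresponds to exactly one exponent vector $(k_0, \cdots, k_m)$ with $k_0 + \cdots + k_m = n$, which is why `combinations_with_replacement` enumerates the left-hand sum.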


Also solved by Peter Bundschuh (West Germany), Leonard Carlitz, J. E. Chance, R. J. Evans, 
S. H. Halton, A. C. Hindmarsh, M. S. Klamkin, O. P. Lossers (Netherlands), Helen M. Marston, 
M. R. Railkar (India), A. G. Shannon (Australia), Nan-Shan Shou, R. P. Soni, C. H. Yong, and the 
proposer. 

Editor’s comment. As pointed out by Klamkin and Railkar, this result is not new but is a result 
in the theory of alternants. For an alternate solution see Theory of Determinants by Muir and Metzler, 
paragraphs 333 and 335. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers, The State University,
New Brunswick, N. J. 08908. Solutions of Advanced Problems in this issue should be typed (with
double spacing) on separate, signed sheets and should be mailed before July 31, 1973. Contributors
(in the United States) who desire acknowledgement of receipt of their solutions are asked
to enclose self-addressed, stamped postcards.


An asterisk (*) means neither the proposer nor the editors supplied a solution. 


5906. Proposed by Gérard Letac, University of Clermont, France 


Let $X_0, X_1, \cdots, X_t, \cdots$ be independent random variables such that $P(X_t = n) = p_n < 1$
for all t and $n = 0, 1, 2, \cdots$ with $\sum_{n=0}^{\infty} p_n = 1$; let $q_n$ denote $P(X_t < n)$. Find the
distribution of

$$q_{X_0} + p_{X_0} q_{X_1} + \cdots + p_{X_0} p_{X_1} \cdots p_{X_{t-1}} q_{X_t} + \cdots.$$

5907. Proposed by J.C. Alexander, University of Maryland 


For $n \ge 1$, let $S_n$ be the set of polynomials of the form

$$p(z) = z^n + a_{n-1} z^{n-1} + a_{n-2} z^{n-2} + \cdots + a_1 z + 1,$$

where $a_1, a_2, \cdots, a_{n-1}$ range through all complex numbers. What is the value of

$$M_n = \min_{p \in S_n} \left(\max_{|z|=1} |p(z)|\right)?$$


5908. Proposed by M. L. Glasser, Battelle Memorial Institute 


Prove 




where $A = i^2 + j^2 + k^2 + l^2 + i + j + k + l + 1$ and the summations are over all
non-negative integers.


5909. Proposed by Gérard Letac, University of Clermont, France 


Let f be a continuous real function on some real, finite dimensional vector space
E. For any base $b = (b_1, \cdots, b_n)$ of E, denote by $E_b = \{z_1 b_1 + \cdots + z_n b_n ;
z_i \in Z, i = 1, \cdots, n\}$, where Z is the set of integers. Is it true that f is a bounded
function when, for any base b, f restricted to $E_b$ is bounded?


5910*. Proposed by P. M. Eakin, University of Kentucky 


Suppose A and B are rings, n an integer and $X_1, \cdots, X_n$, $Y_1, \cdots, Y_n$ indeterminates.
If $A[X_1, \cdots, X_n] = B[Y_1, \cdots, Y_n]$ and A is euclidean, then B is euclidean.


SOLUTIONS OF ADVANCED PROBLEMS 
Powers of an Algebraic Real Number 


5832 [1972, 93] Proposed by Irwin Just, Bronx Community College 


Let n be an integer greater than one. Must there exist an algebraic real number r,
of degree n such that for each positive integer m, $[r^m]$ is an odd integer? ([x] is the
greatest integer not exceeding x.)


Solution by Leonard Carlitz and Richard Scoville, Duke University. We shall
prove the following more general result: Let n, k be integers > 1. Then there exists
an algebraic number r of degree n such that

$$[r^m] \equiv -1 \pmod{k} \qquad (m = 1, 2, 3, \cdots).$$

Proof. Let k be an integer > 1 and let $r_1, r_2, \cdots, r_n$, $n \ge 2$, be the roots of
$$f(x) = \sum_{i=0}^{n} a_i x^i,$$
where the following conditions are satisfied:
(1) $a_n = 1$.
(2) all $a_i$ are integers with $a_0 \equiv a_1 \equiv \cdots \equiv a_{n-1} \equiv 0 \pmod{k}$.
(3) f is irreducible.
(4) all $r_i$ are positive and $r_1 > r_2 > r_3 > \cdots > r_n$.
(5) $\sum_{i=2}^{n} r_i < 1$.
Then from (1) and (2) it follows that

$$L_j = \sum_{i=1}^{n} r_i^j$$

is an integer and that $L_j \equiv 0 \pmod{k}$ for every $j \ge 1$. Then, by (5),

$$0 < \sum_{i=2}^{n} r_i^j < 1, \quad \text{that is,} \quad L_j - 1 < r_1^j < L_j,$$

so we get $[r_1^j] \equiv -1 \pmod{k}$ $(j \ge 1)$.

Now to solve the problem we must show the existence of a polynomial $f_n$ satisfying
(1)-(5). If (1), (2), (4) and (5) are satisfied, so is (3), since if a product is integral, the
product must contain $r_1$, by (5).

Let $f_2(x) = x^2 - Bkx + k$ where B is an integer chosen so large that (5) is satisfied.
Hence (1)-(5) are satisfied.

Now assume $f_{n-1}$ has been chosen to satisfy (1)-(5). We will see that an integer A
can be chosen so large that

$$f_n(x) = x^n - f_{n-1}(Akx)$$

also satisfies (1) to (5): let $r_1$ be the largest root of $f_{n-1}$. Then $f_{n-1}$ changes sign n - 1
times between 0 and $r_1 + 1$. Hence $-f_{n-1}(Akx)$ changes sign n - 1 times between
0 and $(r_1 + 1)/Ak$. If we choose A sufficiently large the effect of $x^n$ is negligible and
$f_n(x) = x^n - f_{n-1}(Akx)$ will change sign n - 1 times between 0 and $(r_1 + 1)/Ak$.
Furthermore $f_n((r_1 + 1)/Ak) < 0$, since $f_{n-1}(r_1 + 1) > 0$. Thus $f_n$ has n - 1 roots
between 0 and $(r_1 + 1)/Ak$ and another large positive root. Clearly by choosing A
still larger, we can ensure (5).

We may also prove that if k is a fixed positive integer, there exists a real algebraic
number r of degree 2 for which

$$[r^m] \equiv 1 \pmod{k}, \qquad m = 1, 2, 3, \cdots.$$
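
The base case $f_2(x) = x^2 - Bkx + k$ of the proof can be examined with exact integer arithmetic. This is a sketch under stated assumptions: the function name is hypothetical, the power sums $L_j$ are generated by the recurrence $L_j = Bk\,L_{j-1} - k\,L_{j-2}$ that follows from the quadratic, and $[r_1^j]$ is obtained from $r_1^j = (L_j + \sqrt{L_j^2 - 4k^j})/2$.

```python
from math import isqrt


def floors_of_powers(k, B, count):
    """Return [r1**j] for j = 1..count, r1 the larger root of x^2 - B*k*x + k.

    r1**j is the larger root of t^2 - L_j*t + k**j, where L_j = r1**j + r2**j,
    so its integer part is (L_j + isqrt(L_j**2 - 4*k**j)) // 2.
    """
    s, p = B * k, k               # sum and product of the two roots
    l_prev, l_cur = 2, s          # L_0 and L_1
    p_pow = p                     # k**j
    floors = []
    for _ in range(count):
        disc = l_cur * l_cur - 4 * p_pow
        floors.append((l_cur + isqrt(disc)) // 2)
        l_prev, l_cur = l_cur, s * l_cur - p * l_prev
        p_pow *= p
    return floors


if __name__ == "__main__":
    print(floors_of_powers(2, 2, 8))  # powers of 2 + sqrt(2)
    print(floors_of_powers(5, 3, 8))  # larger root of x^2 - 15x + 5
```

With k = 2 and B = 2 the root is $2 + \sqrt{2}$ and every integer part is odd; with k = 5 and B = 3 every integer part leaves remainder 4 on division by 5. The parameter choices are arbitrary, subject only to the smaller root being less than 1.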


Also solved by J. G. Mauldon, P. L. Montgomery, and the proposer. 


Functions with a Limit 


5833 [1972, 93]. Proposed by J. Bernard and G. Letac, University of Clermont, 
France 


Let f be a continuous function on the positive real numbers such that $f(x) \le f(nx)$
for any x > 0 and any integer n > 0. Prove that $\lim_{x \to +\infty} f(x)$ exists $(\le +\infty)$.


Solution by William Wong, The City College, New York. Let
$$M = \lim_{r \to \infty} \sup_{x \ge r} f(x), \qquad m = \lim_{r \to \infty} \inf_{x \ge r} f(x)$$
and suppose M > m. Choose k, M > k > m; then f(a) > k for some a and by
continuity, there exists b > a such that for all $x \in [a, b]$, f(x) > k. Let p = ab/(b - a).
Now $x \ge p$ implies $x/a \ge x/b + 1$; thus $x/a \ge n \ge x/b$ for some positive integer n;
i.e., $a \le x/n \le b$. Hence f(x/n) > k, f(x) > k for all $x \ge p$. This contradicts the
definition of m and m < M.




Also solved by C. S. Allen, L. F. Bennett, P. R. Chernoff, R. A. Christiansen, L. E. Clarke (Eng- 
land), J. Cobb, R. O. Davies (England), L. Eifler, N. Felsinger, P. Fisher (Netherlands), J. Fridy, L. 
Gerber, J. Gill, A. C. Hindmarsh, A. A. Jagers (Netherlands), D. R. King, J. Levy, O. P. Lossers 
(Netherlands), J. G. Mauldon, A Meir, S. Monteferrante, P. L. Montgomery, M. Powderly, S. 
Rajnak, W. Snow, J. Sturm, R. K. Tamaki, Nguyen Xuan Uy, H. Van Evelghem (Belgium), F. I. 
Wright, K. L. Yocom, and the proposers. 


Note. Van Evelghem and Mauldon relax the hypothesis by replacing n with $u_n$, $u_n \nearrow \infty$,
$\limsup u_{n+1}/u_n = 1$.
Roots of Irreducible Polynomials of Prime Degree 


5834 [1972, 93]. Proposed by Irwin Just, Bronx Community College 


Let f be an irreducible seventh degree polynomial with rational coefficients and 
let S be a proper subset of the zeros of f. Can the sum of the elements of S be rational? 


Solution by Stephen Pierce, University of Toronto. The answer is no, and we
will prove this in the case that f has degree p, where p is any prime. Let G be the
Galois group of f regarded as a subgroup of $S_p$. Since p | |G|, there is a p-cycle $\sigma$ in
G, say $\sigma = (1 2 3 \cdots p)$. Suppose $\lambda_1, \cdots, \lambda_p$ are the roots of f and assume that

$$\lambda_{i_1} + \cdots + \lambda_{i_t} = r,$$

r a rational number. Apply $\sigma, \sigma^2, \cdots, \sigma^{p-1}$ to this equation, thus obtaining p equations.
Thus, there is a p x p matrix A with elements 0 and 1 such that $A\lambda = \mathbf{r}$, where

$$\lambda = (\lambda_1, \cdots, \lambda_p) \quad \text{and} \quad \mathbf{r} = (r, \cdots, r).$$

Moreover, A is a circulant of degree p, with not all entries the same. Hence $\det(A) \ne 0$
and $\lambda = A^{-1}\mathbf{r}$. Thus, $\lambda_1, \cdots, \lambda_p$ are rational, a contradiction.
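
The step "hence det(A) is nonzero" can be checked exhaustively for p = 7, the degree in the original question. The sketch below uses exact rational Gaussian elimination; all function names are illustrative, and the composite case is included only to show that primality matters.

```python
from fractions import Fraction
from itertools import product


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def circulant(first_row):
    """Circulant matrix whose i-th row is first_row shifted right by i places."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def singular_rows(size):
    """0/1 first rows, not all equal, whose circulant matrix is singular."""
    return [row for row in product((0, 1), repeat=size)
            if 0 < sum(row) < size and det(circulant(row)) == 0]


if __name__ == "__main__":
    print(singular_rows(7))       # prime order
    print(singular_rows(4)[:3])   # composite order, for contrast
```

For the prime order 7 no singular example is found, while for order 4 a row such as (1, 0, 1, 0) already gives repeated rows and hence a singular matrix.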


Also solved by L. Carlitz, A. A. Jagers (Netherlands), J. Tillman, and the proposer. 


Convex Type Measurable Functions 
5835 [1972, 94]. Proposed by G. Letac, University of Clermont-Ferrand, 
Aubiére, France 


Prove that the constants are the only measurable functions f on the positive real 
line such that for any positive x and y, f(x + y) belongs to the interval spanned by 


f(x) and f(y). 


Solution by J. G. Butler, University of Alberta. Suppose, on the contrary, that
f(x) is a measurable function on $R^+$ assuming at least two values $a_1, a_2$, $a_1 < a_2$,
at $x = x_1, x_2$, respectively. For y > 0, define

$$S_1(y) = \{x : 0 < x < y \text{ and } f(x) \le a_1\},$$
$$S_2(y) = \{x : 0 < x < y \text{ and } f(x) \ge a_2\}.$$

If $x \in (0, \tfrac12 x_1) \setminus S_1(\tfrac12 x_1)$, then $a_1 = f(x_1)$ lies between f(x) and $f(x_1 - x)$ so that
$x_1 - x \in S_1(x_1)$. Hence

$$S_1(x_1) \supset S_1(\tfrac12 x_1) \cup \left[x_1 - \left((0, \tfrac12 x_1) \setminus S_1(\tfrac12 x_1)\right)\right]$$

and so $\mathrm{meas}(S_1(x_1)) \ge \tfrac12 x_1$.

Now the hypotheses imply f(qx) = f(x), with q a positive rational, so that for all
y > 0 and all rationals $q < y/x_1$, we have $S_1(y) \supset q\,S_1(x_1)$. It follows that
$\mathrm{meas}(S_1(y)) \ge \tfrac12 y$. Similar reasoning yields $\mathrm{meas}(S_2(y)) \ge \tfrac12 y$. Since $S_1(y)$, $S_2(y)$
are disjoint measurable sets, equality must hold.

It now follows, using additivity properties of measure, that $\mathrm{meas}(S_1(1) \cap M)
= \tfrac12\,\mathrm{meas}(M)$ for each measurable subset M of (0, 1). Taking $M = S_1(1)$ yields the
desired contradiction.


Also solved by P. R. Chernoff, R. A. Christiansen, R. O. Davies (England), John Gill, Ralph 
Jones, Joel Levy, and the proposer. 


Editor's note. Chernoff and the proposer note that the hypothesis of measurability is critical;
for if g is a nonmeasurable additive function then g(x)/x is not constant but g(x + y)/(x + y)
is a convex combination of g(x)/x and g(y)/y, x, y > 0.


Minute Translates of a Measurable Function 


5836 [1972, 94]. Proposed by Eric Bedford and Michael Taylor, University 
of Michigan 


Let f(x) be bounded and measurable on (0, 1). Is it true that $\lim_{n \to \infty} f(x - 1/n)
= f(x)$ almost everywhere? Prove, or provide a counterexample.


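Solution by Amram Meir, University of Alberta. The statement is false. We shall
prove that for a given $\varepsilon > 0$ there exists a bounded measurable f(x), such that
$$(*)\qquad \mu\left\{x : \limsup_{n \to \infty} \left|f\left(x - \frac{1}{n}\right) - f(x)\right| = 1\right\} > 1 - \varepsilon,$$
where $\mu\{\cdot\}$ denotes the Lebesgue measure. Let $x \in (0, 1)$, then x can be written in
the form

$$x = \sum_{n=2}^{\infty} \frac{a_n}{n!}, \qquad 0 \le a_n \le n - 1,$$

where $a_n = [n!\,x] - n[(n-1)!\,x]$.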
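
The digit formula can be tried on a rational x with exact arithmetic. This is an illustrative sketch; `digit` and `partial_sum` are hypothetical names, and the sample value 5/7 is an arbitrary choice for which the expansion terminates.

```python
from fractions import Fraction
from math import factorial, floor


def digit(x, n):
    """a_n = [n! x] - n [(n-1)! x], the n-th digit of x in the factorial base."""
    return floor(factorial(n) * x) - n * floor(factorial(n - 1) * x)


def partial_sum(x, terms):
    """Sum of a_n / n! for n = 2, ..., terms."""
    return sum(Fraction(digit(x, n), factorial(n)) for n in range(2, terms + 1))


if __name__ == "__main__":
    x = Fraction(5, 7)
    print([digit(x, n) for n in range(2, 10)])
    print(partial_sum(x, 7))
```

The partial sums telescope to $[N!\,x]/N!$, so they increase to x, and for rational x they reach x exactly once N! x is an integer.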
Let N be an integer such that $2^{-N} < \varepsilon$, let $n_k = 2^{N+k}$ $(k = 1, 2, \cdots)$. We define the
subsets $B_k$ of (0, 1) by

$$B_k = \{x : a_{n_k} = 0\}, \qquad k = 1, 2, \cdots.$$

Clearly, $B_k$ is measurable and $\mu(B_k) = 2^{-N-k}$. Let now

$$A = \bigcap_{k=1}^{\infty} B_k^c,$$

where $B_k^c$ denotes the complement of $B_k$ in (0, 1). A is measurable and $\mu A \ge 1
- \sum_{k=1}^{\infty} 2^{-N-k} > 1 - \varepsilon$. Now let f(x) = 0 for $x \in A$, f(x) = 1 for $x \in A^c$, and let $x_0$ be
an arbitrary point in A. We have

$$x_0 = \sum_{n=2}^{\infty} \frac{a_n}{n!}, \qquad a_{n_k} \ne 0 \text{ for } k = 1, 2, \cdots.$$

It is easy to see that because of $a_{n_k} \le n_k - 1$,

$$\frac{a_{n_k}}{(n_k)!} = \frac{1}{m_k}$$

with a suitable integer $m_k$, and that $m_k \to \infty$. Thus

$$x_0 - \frac{1}{m_k} \notin A,$$

and so $f(x_0 - 1/m_k) = 1$, $k = 1, 2, \cdots$, while $f(x_0) = 0$. We thus see that $x_0$ belongs
to the set whose measure is estimated under (*). Since $\mu A > 1 - \varepsilon$, the inequality (*)
follows.


Also solved by G. J. Butler, R. O. Davies (England), Douglas Lind, and the proposers. 


Editor's note. Davies raises the question as to the consequence of replacing 1/n by some other
null sequence. C. J. Neugebauer observes that the sequence $f_n(x) = f(x - 1/n)$ converges in mean
to f(x) and so some subsequence converges to f(x) almost everywhere.


Inverses in Prime Rings 
5837 [1972, 94]. Proposed by I. N. Herstein and Susan Montgomery 
Cancelled. Duplicate: See solution 5797 [1972, 916]. 


Number of Non-isomorphic Groups of a Given Order 


5838 [1972, 187]. Proposed by R. B. Eggleton, University of Calgary 


Let N(g) denote the number of isomorphism classes of abelian groups of order g. 
The equation N(x) = n is solvable for $1 \le n \le 12$, and for infinitely many other
natural numbers n, but there is no solution when n = 13. Show that there are in- 
finitely many natural numbers n for which there is no solution. 


Solution by Eric Rosenthal, student, Yale University. It is a consequence of
the structure theorems for finite abelian groups that if $q_1, q_2, \cdots, q_k$
are distinct primes, then

$$N(q_1^{e_1} q_2^{e_2} \cdots q_k^{e_k}) = P(e_1) P(e_2) \cdots P(e_k),$$

where P denotes the partition function. So N(x) = p has no solution if p is a prime
not in the range of P.
It is known that the nth prime $p_n = O(n^2)$ [Problem E 2275 [1972, 89]] and that

$$P(n) \sim \frac{1}{4n\sqrt{3}} \exp\left(\pi \sqrt{2n/3}\right).$$
[Marshall Hall, Combinatorial Theory, Blaisdell, 1967, p. 40.]
So $p_n = o(P(n))$ and most primes are not values of P. If p is one of the infinitely
many primes outside the range of P then N(x) = p is insoluble.
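
Because exponents can be placed on distinct primes at will, the values taken by N are exactly the finite products of partition numbers. The sketch below enumerates them up to a bound; the function names and the bound are illustrative assumptions.

```python
def partition_table(limit):
    """Return [P(0), P(1), ..., P(limit)] by the standard dynamic program."""
    table = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            table[total] += table[total - part]
    return table


def attained_values(limit):
    """All n <= limit expressible as a product of partition numbers."""
    # P(e) >= e for e >= 1, so exponents above `limit` cannot contribute.
    factors = sorted({v for v in partition_table(limit) if 1 < v <= limit})
    reachable = {1}
    frontier = [1]
    while frontier:
        value = frontier.pop()
        for factor in factors:
            candidate = value * factor
            if candidate <= limit and candidate not in reachable:
                reachable.add(candidate)
                frontier.append(candidate)
    return reachable


def missing_values(limit):
    """Natural numbers n <= limit for which N(x) = n has no solution."""
    reachable = attained_values(limit)
    return [n for n in range(1, limit + 1) if n not in reachable]


if __name__ == "__main__":
    print(missing_values(60))
```

The first value missed is 13, in agreement with the statement of the problem; the composite 26 is also missed, so the exceptional values are not only primes.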


Also solved by K. A. Beres, P. X. Gallagher, R. Harris & F. Oliva & R. Thrasher, C. V. Heuer 
& G. A. Heuer, P. L. Montgomery, and the proposer. 


Generalization of the Power Function 


5839 [1972, 187]. Proposed by A. D. Ziebur, State University of New York at 
Binghamton 


The equation $\pi(x, y) = x^y$ defines a function from $R^+ \times R$ to $R^+$ (R is the set of
real numbers, $R^+$ the set of positive reals) such that $\pi(x, n) = x^n$ when n is an integer,
and $\pi(x, yz) = \pi(\pi(x, y), z)$. Is the power function the only function with these
properties?

Solution by P. L. Montgomery, San Rafael, California. It is not the only
function. Let $\{\ln 2, \ln 3, \ln 5, \cdots\} \cup B$ be a Hamel basis for the real numbers. For each
$x \in R^+$ let

$$\ln x = \sum_{p \text{ prime}} a_p \ln p + \sum_{b \in B} r_b\, b,$$

where the a's and the r's are rational numbers, almost all zero. Set

$$\sigma(x) = \prod_{p \text{ prime}} p^{a_p}.$$
Also set $\sigma(0) = 0$ and $\sigma(-x) = -\sigma(x)$ if x > 0. Then $\sigma$ fixes all integers; also
$\sigma(xy) = \sigma(x)\sigma(y)$ for all real x and y. It quickly follows that the function $\pi(x, y)
= x^{\sigma(y)}$ satisfies the stated conditions, as well as $\pi(xy, z) = \pi(x, z)\pi(y, z)$, but is not
the power function.


Also solved by O. P. Lossers (Netherlands) and the proposer. 


THE AMERICAN 
MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 By BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 NUMBER 5 
CODEN: AMMYAE 
CONTENTS

Notice
The Geometry of Connections . . . R. S. MILLMAN AND ANN K. STEHNEY
An Introduction to Matroid Theory . . . R. J. WILSON
Stochastic Equations and their Applications . . . G. C. PAPANICOLAOU
The Main Crises . . . S. BIRNBAUM
MATHEMATICAL NOTES
The Sign of the Bernoulli Numbers . . . L. J. MORDELL
The Sign of the Bernoulli and Euler Numbers . . . L. CARLITZ AND R. SCOVILLE
Basically Bounded Sets and a Generalized Heine-Borel Theorem . . . NEIL HINDMAN
RESEARCH PROBLEMS
A Problem on Rational Functions . . . A. K. PIZER
CLASSROOM NOTES
A Condition under which a Mapping is a Homeomorphism . . . W. R. DERRICK
MATHEMATICAL EDUCATION
Independent Study for Undergraduates . . . W. C. RAMALEY
ELEMENTARY PROBLEMS AND SOLUTIONS
ADVANCED PROBLEMS AND SOLUTIONS
REVIEWS
NEWS AND NOTICES


(Continued on inside cover) 


MAY 1973 


MATHEMATICAL ASSOCIATION OF AMERICA
November Meeting of the Ohio Section
November Meeting of the Seaway Section
The Fifty-sixth Annual Meeting of the Association
Announcement of the Walter B. Ford Lecture Fund
Academic Members Elected into the Association
Calendars of Future Meetings


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p.2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 
protection against loss. 

Backlog: Main Articles 12 months, Math. Notes 15 months, Research Problems 7 months, Classroom Notes
11 months, Math. Education 10 months.


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of Mathematics,
Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor;
ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of America,
SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B.
WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036.


HARLEY FLANDERS, Editor
ALEX ROSENBERG, Editor-Elect
ASSOCIATE EDITORS

JOSHUA BARLAZ        J. G. HARVEY        SEYMOUR SCHUSTER
E. R. BERLEKAMP      ERIC S. LANGFORD    J. ARTHUR SEEBACH, Jr.
JANE W. DI PAOLA     P. D. LAX           E. P. STARKE
ROBERT GILMER        ARTHUR MATTUCK      LYNN A. STEEN
RICHARD GUY          M. W. POWNALL       JAMES WENDEL
RAOUL HAILPERN       GIAN-CARLO ROTA


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


NOTICE 


With the January 1974 issue the editorship of the MONTHLY will be taken over 
by Alex Rosenberg, Department of Mathematics, Cornell University, Ithaca, New 
York 14850, U.S.A. All manuscripts for main articles, and all editorial correspondence 
not concerned with previously submitted material, should be addressed to Ithaca 
from now on. The Statement of Policy (Vol. 76 (1969), p. 2) will of course remain 
in full force. 


THE GEOMETRY OF CONNECTIONS
R. S. MILLMAN, Southern Illinois University and ANN K. STEHNEY, Wellesley College 


Modern geometry was born when Riemann first separated the concept of geometry
from the concept of space. His inaugural lecture at Göttingen, On the Hypotheses
which lie at the Foundation of Geometry (1854), began: ‘‘As is well known,
geometry presupposes the concept of space, as well as assuming the basic principles
for constructions in space. . . . The relationship between these presuppositions is
left in the dark.’’ (From Spivak’s translation [10], Vol. II. See also Smith [9].)
The purpose of this paper is to see how 119 years of differential geometry and topo- 
logy have shed light on this relationship. We shall first investigate the topological 
basis for modern differential geometry and then add some geometric structure to 
obtain the ‘“‘principles for construction in space.’’ 

In Section 1 we define a differentiable manifold, which plays the role of ‘‘space’’ 
and provides points for our geometries. Riemann called this a continuous mani- 
fold or ‘‘n-fold extended quantity.’’ Here we are in the realm of differential topo- 
logy. To do geometry we need some structure which provides lines (‘‘constructions 
in space’’). The rest of the paper is concerned with three geometric structures, 
involving the use of calculus, which are motivated by the wish that lines (called 
geodesics) have properties analogous to those of straight lines in Euclidean space. 

In Euclidean n-space $R^n$, a straight line may be defined as either a curve $\alpha$ such
that

$$\frac{d^2\alpha}{dt^2} = 0$$

or a curve which represents the shortest path between points. The first condition
means that the tangent vector to $\alpha$ is constant along $\alpha$, or that the derivative of the
tangent vector in the direction of the tangent vector (i.e., the acceleration) is zero.

Richard S. Millman received his Cornell Ph. D. under H. C. Wang in 1971. He held an Assistant
Professorship at Ithaca College before taking on his present position.
Ann K. Stehney received her Ph. D. in 1971 at SUNY, Stony Brook, under John A. Thorpe.
She has been at Wellesley College since then. Both authors work in modern differential geometry.
Editor.

Our first geometric structure (Section 2) is a linear connection or covariant
differentiation which enables us to differentiate one vector field with respect to another
and so mimic the first ‘‘definition’’ of a straight line in $R^n$. We may then define a
geodesic as a curve such that the covariant derivative of the tangent vector field
in the direction of the tangent vector field is zero. Classically, a connection was
defined by Christoffel (1869) to be a set of symbols $\{{}^{\,i}_{jk}\}$ or $\Gamma^{i}_{jk}$. The modern
viewpoint was finally formulated about 1950 by Koszul.

We shall see that the concept of a linear connection is equivalent to that of 
parallel translation of vectors along curves. In $R^n$, this corresponds to moving
the origin of a vector to any point along the curve, keeping its direction and magni- 
tude fixed. The emphasis on parallel translation is due to T. Levi-Civita (1917). 
It was E. Cartan who saw the importance of parallel translation and who introduced 
a more generalized notion of “‘tangent space’’ in order to define a geometry. This 
made Cartan’s original work very difficult to read. However, after 1930 the notion 
of a fiber bundle was developed. By 1950 a theory of connections on fiber bundles 
emerged, due to Ehresmann, Chern, and others, and shed light on a great deal of 
Cartan’s work. We shall return to this in Section 4. 

In our third section we define a Riemannian metric on a differentiable manifold. 
The choice of a Riemannian metric (there are many on any manifold) endows the 
manifold with much structure, for example that of a metric space in the usual sense. 
We may then ask that geodesics be curves which minimize distance. We shall see 
that every Riemannian metric also gives rise to a unique linear connection enjoying 
certain ‘‘nice’’ properties, and that curves which minimize distance are geodesics 
with respect to this connection (but not conversely). The notion of a Riemannian 
metric dates back to Riemann’s 1854 lecture. The association of a connection is 
due to Levi-Civita.

We finally return to the notion of parallel translation and show that it is equivalent 
to a unique path-lifting property on a principal fiber bundle. This supplies a geo- 
metric structure, also called a connection, on the fiber bundle. This approach is 
not only useful (see Chern [2]), but with today’s hindsight, also reasonable, because 
it includes Cartan’s theory of moving frames as a special case (see [10], Vol. II, 
Ch. 7). Spivak calls this final approach ‘‘not only more abstract, more elegant, 
and more incomprehensible, but also more general’’ ([10], Vol. I, pp. 8-16).

To summarize, we present the linear connection first as a basic, minimal geo- 
metric structure which gives ‘‘lines.”’ The existence of a Riemannian metric leads 
back naturally to a connection for which lines have nice length-minimizing properties 
similar to those of Euclidean space. A connection on a manifold is also equivalent 
to a connection on a particular principal fiber bundle. 




The reader may notice that, among other important topics, we have omitted 
the theory of differential forms. We feel that the geometric content of forms is harder 
to grasp than the search for geodesics which we have undertaken. We refer the reader 
to Flanders [4] or Chern [2] for this topic. 

Details of the ideas in this paper may be found in Hicks [5] and Spivak [10]. 
An excellent exposition of connections on fiber bundles has been given by Nomizu 
[8]. More detailed accounts may be found in Kobayashi and Nomizu [6], or Bishop 
and Crittenden [1]. 


1. Introduction to Differentiable Manifolds. A differentiable manifold locally 
enjoys the topological properties of Euclidean space. It provides a setting, more 
general than that of Euclidean space, in which the differentiability of functions is 
meaningful. The many definitions we shall give should be examined for their Euc- 
lidean analogue. Since the theorems in this section are versions of results from 
advanced calculus, we shall not attempt complete proofs. 

Recall that a function f from (an open subset of) Euclidean n-space $R^n$ into
m-space $R^m$ is differentiable ($C^\infty$) if, writing $f(x) = (f_1(x), \cdots, f_m(x))$, the coordinate
functions $f_i$ have continuous partial derivatives of all orders with respect to
the coordinates of $R^n$. If $u_1, \cdots, u_n$ denote the coordinate functions on $R^n$ and
$v_1, \cdots, v_m$ denote those on $R^m$, then $f_i = v_i \circ f$ and the Jacobian matrix at $x \in R^n$
of f is the matrix

$$(1.1)\qquad J_f(x) = \left[\frac{\partial f_i}{\partial u_j}(x)\right]$$

of the partial derivatives at x of the coordinate functions of f. By the Inverse Function
Theorem, f has a differentiable inverse in a neighborhood of x if and only if
$J_f(x)$ is non-singular.

A differentiable manifold M of dimension n is a paracompact Hausdorff topological
space and a collection $\{(U_\gamma, \phi_\gamma) \mid \gamma \in \Gamma\}$ satisfying

(i) the $U_\gamma$ are open sets in M with $M = \bigcup_{\gamma \in \Gamma} U_\gamma$ and each $\phi_\gamma$ is a homeomorphism
from $U_\gamma$ onto an open set in $R^n$,

(ii) whenever $U_\gamma \cap U_\delta$ is non-empty, the homeomorphism $\phi_\delta \circ \phi_\gamma^{-1}$ from
$\phi_\gamma(U_\gamma \cap U_\delta)$ onto $\phi_\delta(U_\gamma \cap U_\delta)$ is $C^\infty$ as a map between Euclidean spaces, and

(iii) the collection $\{(U_\gamma, \phi_\gamma)\}$ is maximal with respect to (i) and (ii), that is, it
contains every $(U, \phi)$ satisfying (i) such that $\phi \circ \phi_\gamma^{-1}$ and $\phi_\gamma \circ \phi^{-1}$ are $C^\infty$
for all $\gamma \in \Gamma$.

Since $\{U_\gamma\}$ is an open covering for M, each point of the manifold has a neighborhood
homeomorphic to an open set in $R^n$, a condition which is sometimes stated
‘‘M is locally Euclidean.’’ The set $U_\gamma$ is called a coordinate neighborhood for any
of its points and the pair $(U_\gamma, \phi_\gamma)$ is called a chart for M. Condition (ii) distinguishes
a differentiable manifold from a topological manifold; we shall refer to it as the
compatibility of the charts. We shall use ‘‘n-manifold’’ exclusively to mean a differentiable
manifold of dimension n. The collection $\{(U_\gamma, \phi_\gamma) \mid \gamma \in \Gamma\}$ will be called
a differentiable (or $C^\infty$) structure for M. Condition (iii) implies that there is a
unique differentiable structure on M containing any collection of charts which
satisfies (i) and (ii). In the examples which follow, we shall always mean the differentiable
structure containing the given charts.


Examples. 1. $R^n$ itself is trivially an n-manifold, for example as determined by
the single chart ($R^n$, identity). Similarly any open set in $R^n$ is an n-manifold. A single
chart will not suffice to define the manifold structure, however, unless M is
homeomorphic to a (necessarily open) subset of $R^n$.

2. $S^1$, the unit sphere in $R^2$, is a differentiable 1-manifold. Writing
$S^1 = \{(x, y) \mid x^2 + y^2 = 1\}$, we choose the $U_i$'s to be open semicircles: $U_1, \dots, U_4$
are the intersections of $S^1$ with the right (x > 0), left (x < 0), upper (y > 0), and
lower (y < 0) half-planes respectively (see Fig. 1a). We let $\phi_1$ and $\phi_2$ assign to a
point its second coordinate and $\phi_3$ and $\phi_4$ assign its first. Each $\phi_i$ is a
homeomorphism from its domain $U_i$ onto the open interval (-1, 1). We shall check the
compatibility condition for $U_1 \cap U_4$ and leave the other cases to the reader (of course,
$U_1 \cap U_2$ and $U_3 \cap U_4$ are empty). Let v be an element of $\phi_4(U_1 \cap U_4) = (0, 1)$.
Then

$\phi_1 \circ \phi_4^{-1}(v) = \phi_1(v, -[1 - v^2]^{1/2}) = -[1 - v^2]^{1/2}$

(see Fig. 1b), and so $\phi_1 \circ \phi_4^{-1}$ is a $C^\infty$ homeomorphism from (0, 1) onto (-1, 0).
Other choices for charts are possible; in fact, only two overlapping coordinate
neighborhoods are needed.
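
The compatibility check for $U_1 \cap U_4$ can also be tried numerically. The following is a minimal Python sketch and is not part of the original article; the names phi1, phi4_inverse and transition are illustrative choices. It evaluates $\phi_1 \circ \phi_4^{-1}$ at a few points of (0, 1).

```python
import math

# Charts from Example 2: phi1 lives on the right semicircle (x > 0) and
# returns the second coordinate; phi4 lives on the lower semicircle (y < 0)
# and returns the first coordinate.
def phi1(point):
    x, y = point
    return y

def phi4_inverse(v):
    # The unique point of the lower semicircle whose first coordinate is v.
    return (v, -math.sqrt(1.0 - v * v))

def transition(v):
    # phi1 o phi4^{-1}, defined for v in (0, 1).
    return phi1(phi4_inverse(v))

samples = [0.1, 0.5, 0.9]
values = [transition(v) for v in samples]
```

Each value lies in (-1, 0), in agreement with the formula $-[1 - v^2]^{1/2}$ derived above.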


[Fig. 1, panels (a) and (b)]


3. $S^2$, the unit sphere in $R^3$, is a differentiable 2-manifold. One choice of charts
is analogous to the choice above for $S^1$. The six coordinate neighborhoods are the
hemispheres x > 0, x < 0, y > 0, y < 0, z > 0, and z < 0 (i.e., the intersections
of $S^2$ with these open half-spaces) and the mappings $\phi_i$ are orthogonal projections
onto the coordinate planes. For example, if $U_1$ is the hemisphere x > 0, then
$\phi_1(x, y, z) = (y, z)$ and the image of $\phi_1$ is the open unit disk in the (y, z)-plane.
Thus in $U_1$, a pair of numbers may be used as coordinates for points. The unique
point in this hemisphere with coordinates (y, z) is $(+[1 - y^2 - z^2]^{1/2}, y, z)$. The
reader may provide the details that the conditions for charts are satisfied.




4. The general linear group G = Gl(n, R) of all non-singular n x n matrices
(under matrix product) is an $n^2$-manifold. G is identified with an open set in $R^{n^2}$
via the map which strings out the matrix entries one row at a time:

$[a_{ij}] \mapsto (a_{11}, \dots, a_{1n}, a_{21}, \dots, a_{2n}, \dots, a_{nn})$.

The determinant function on $R^{n^2}$, a polynomial in its variables, is continuous, and G
is the inverse image under "det" of the open set R - {0}, and therefore open.
Since det(G) is not connected, G is not.

Since differentiability is a local condition, we can extend that notion to contin- 
uous maps whose domain is a differentiable manifold and whose range is R or 
any other differentiable manifold. The basic idea is to use the charts to move back 
to Euclidean space. 

Let f be a real-valued function defined on an open subset V of M. Then f is
called differentiable (or $C^\infty$) at $m \in V$ if $f \circ \phi^{-1}$ is differentiable at $\phi(m)$ in $R^n$ for
all charts $(U, \phi)$ for M at m. If V is open in M, f is called differentiable on V if it
is differentiable at every $m \in V$. The set of all differentiable real-valued functions
on V will be denoted $C^\infty(V)$.

Let M and N be manifolds, possibly of different dimensions, and f be a map
from (an open set in) M into N. Let $(U, \phi)$ and $(W, \psi)$ be charts for M and N at m
and f(m) respectively. Then f is called differentiable at m if $\psi \circ f \circ \phi^{-1}$ is
differentiable at $\phi(m)$. This definition reduces to the one above for N = R. It is not
hard to show that the composition of differentiable maps is differentiable.


Examples. 1. If $(U, \phi)$ is a chart for the n-manifold M, then $\phi$ is $C^\infty$. If $u_i$
is the ith coordinate function of $R^n$, each $x_i = u_i \circ \phi$ is a real-valued function
which is $C^\infty$ on U. The function $\phi$ is in fact determined by the n real-valued
functions $x_1, \dots, x_n$, called coordinate functions or a coordinate system for U.

2. If $M_1$ and $M_2$ are $n_1$- and $n_2$-dimensional manifolds respectively, the product
space $M_1 \times M_2$ is a differentiable $(n_1 + n_2)$-manifold in a natural way. The
differentiable structure on $M_1 \times M_2$ is the one which makes the projection maps
$\pi_i : M_1 \times M_2 \to M_i$ (i = 1, 2) onto the factor spaces differentiable.

3. Let G = Gl(n, R). The determinant function on G is $C^\infty$ since it is the
restriction of a polynomial to an open subset of $R^{n^2}$. Furthermore, the group structure
of G is consistent with its differentiable structure in the following sense. The
functions $(A, B) \mapsto AB$ from G x G into G and $B \mapsto B^{-1}$ from G into G are $C^\infty$ maps
between manifolds. (Equivalently, the map $(A, B) \mapsto AB^{-1}$ from G x G into G is $C^\infty$.)
Considering G as a subset of $R^{n^2}$, the coordinate functions of $R^{n^2}$ provide local
coordinates for G in a neighborhood of any point $A \in G$. The coordinates of AB
(or $B^{-1}$) are sums of products of the coordinates of A and B (or rational functions
of the coordinates of B alone), hence all partial derivatives of the maps exist.
Similarly, for fixed $A \in G$, the maps $L_A$ (left translation by A) and $R_A$ (right translation
by A), given by

$L_A(B) = AB$ and $R_A(B) = BA$,

are $C^\infty$ maps from G onto itself.

Gl(n, R) is an example of a Lie group, a group which is also a $C^\infty$ manifold
and whose operations $(g, h) \mapsto gh$ and $g \mapsto g^{-1}$ are $C^\infty$ maps (from G x G into G
and from G into G, respectively).

Just as a differentiable curve in Euclidean space has a tangent vector at each point, 
a differentiable manifold has at each point a collection of tangent vectors which 
form a linear space. If M is a submanifold of R? (a subset whose manifold structure 
is nicely related to that of R ), its tangent vectors at m may be pictured as the tangents 
at m to curves in R? whose images lie in M. (For a definition of ‘‘submanifold,”’ 
see for example [5], p. 13.) Such a viewpoint is, however, not intrinsic, and we shall 
concentrate on the "directional derivative" aspect of a tangent vector. In $R^n$, a
tangent vector at a point assigns to each real-valued function a number, its
derivative in that direction.

A tangent vector to M at m is a function $X_m$ from $C^\infty(M)$ to R which satisfies

(i) (linearity) $X_m(af + bg) = aX_m(f) + bX_m(g)$, and

(ii) (product rule) $X_m(fg) = f(m)X_m(g) + g(m)X_m(f)$,

for all real numbers a and b and all f and g in $C^\infty(M)$. For $M = R^n$, it may be shown
using Taylor's theorem (see [5]) that any $X_m$ satisfying (i) and (ii) may be expressed
uniquely as

$X_m = \sum_i a_i \frac{\partial}{\partial u_i}\Big|_m$

where the $a_i$'s are constants and the summation ranges over i = 1, ..., n. In fact,
$a_i = X_m(u_i)$. Thus the correspondence $X_m \leftrightarrow (a_1, \dots, a_n)$ gives an isomorphism of
the set of tangent vectors to $R^n$ at the point m and the vector space $R^n$ itself.
Since m is also an n-tuple, our $X_m$ is a tangent vector in the classical sense, namely,
$(m, a_1, \dots, a_n)$.

It is easily verified that for any constant function c, $X_m(c) = 0$. In addition,
$X_m(f)$ depends on f only in a neighborhood of m, so that if functions f and g agree
on an open set containing m, then $X_m(f) = X_m(g)$. Therefore $X_m(f)$ makes sense
when f is defined only on an open set containing m (see [10], Vol. I, p. 4-2).

The tangent space to M at m, denoted $T_mM$, is the set of tangent vectors to M
at m, with the usual addition and scalar multiplication for real-valued functions:

$(X + Y)_m(f) = X_m(f) + Y_m(f)$
$(aX)_m(f) = aX_m(f)$

for $X_m, Y_m \in T_mM$, $f \in C^\infty(M)$, and $a \in R$. The tangent space has been defined
independently of any coordinate system for M near m; however, the choice of a
coordinate system provides a basis for $T_mM$. Let $(U, \phi)$ be a chart for M with $m \in U$,
and let $x_i = u_i \circ \phi$ for i = 1, ..., n. Define the function $\partial/\partial x_i|_m$ from $C^\infty(M)$ to R
by

$\frac{\partial}{\partial x_i}\Big|_m (f) = \frac{\partial (f \circ \phi^{-1})}{\partial u_i}\Big|_{\phi(m)}$

for $f \in C^\infty(M)$. The right-hand side is the partial derivative of a real-valued function
on $R^n$. The linearity and product rule for partial differentiation imply that $\partial/\partial x_i|_m$
is a tangent vector to M at m. We may write

$\frac{\partial f}{\partial x_i}\Big|_m$ for $\frac{\partial}{\partial x_i}\Big|_m (f)$.

The notation $\partial/\partial x_i$ suggests derivative in the $x_i$-direction. This interpretation is
consistent with the identification of U and $\phi(U)$ in $R^n$.


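LEMMA. The set of tangent vectors $\{\partial/\partial x_i|_m\}$ is linearly independent in $T_mM$.


Proof. Assume that some linear combination $\sum_i a_i (\partial/\partial x_i)|_m$ is zero in $T_mM$.
(All sums are taken over the range i = 1, ..., n.) Then for every $C^\infty$ function f,
$\sum_i a_i (\partial f/\partial x_i)|_m = 0$. In particular, for $x_j = u_j \circ \phi$ ($1 \le j \le n$),

$0 = \sum_i a_i \frac{\partial x_j}{\partial x_i}\Big|_m = \sum_i a_i \frac{\partial u_j}{\partial u_i}\Big|_{\phi(m)} = a_j$

since $\partial u_j/\partial u_i|_{\phi(m)} = \delta_{ij}$ (Kronecker delta). ∎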

The proof that $\{\partial/\partial x_i|_m\}$ spans $T_mM$ is too long to be included here (see [5],
p. 7). Once that is established, we see that $\{\partial/\partial x_i|_m\}$ is a basis for $T_mM$ and the
tangent space at any point has the same dimension as the manifold. With respect
to this basis, the tangent vector $X_m$ at m is expressed

(1.2)    $X_m = \sum_i X_m(x_i) \frac{\partial}{\partial x_i}\Big|_m$.

This method actually provides a basis for $T_qM$ for every $q \in U$. If $\{a_i\}$ is a collection
of real-valued functions on U, $\sum_i a_i(\partial/\partial x_i)$ assigns to each $q \in U$ a tangent vector,

$\sum_i a_i(q) \frac{\partial}{\partial x_i}\Big|_q$.


A vector field X on M is a function which assigns to each $m \in M$ an element
$X_m$ in $T_mM$. Every vector field takes the set of real-valued functions on M into itself:
for $f: M \to R$, Xf is the function from M to R given by $(Xf)(m) = X_m(f)$. We
are usually interested only in those which take $C^\infty(M)$ into itself. This property
may be checked locally, for if X is a vector field on M, the following are equivalent:

(i) for every $f \in C^\infty(M)$, Xf is in $C^\infty(M)$, and

(ii) for every chart $(U, \phi)$ on M, the functions $a_i: U \to R$ defined by

$X_m = \sum_i a_i(m) \frac{\partial}{\partial x_i}\Big|_m$

are $C^\infty$.

That is, in U, X is a linear combination of the vector fields $\partial/\partial x_i$ with coefficients
in $C^\infty(U)$.

If X has these properties, it is called a differentiable (or $C^\infty$) vector field. $\mathfrak{X}(M)$
will denote the vector space of all $C^\infty$ vector fields on M. Note that if f is in $C^\infty(M)$
and X is in $\mathfrak{X}(M)$, then fX is in $\mathfrak{X}(M)$, whereas Xf is in $C^\infty(M)$.

If $f: M \to N$ is a $C^\infty$ map between manifolds, any function g in $C^\infty(N)$ may
be composed with f to yield a function $g \circ f$ in $C^\infty(M)$. This provides a mapping
$f_*$, the differential of f, from the set of tangent vectors to M into the set of tangent
vectors to N. It is perhaps easiest to think of $f_*$ pulling functions on N back to M
for differentiation. If $X_m$ is in $T_mM$, we specify $f_*(X_m) \in T_{f(m)}N$ by giving its value
on any $g \in C^\infty(N)$:

$[f_*(X_m)](g) = X_m(g \circ f)$.

It is easy to check that for each $m \in M$, $f_*$ is a linear map from $T_mM$ into $T_{f(m)}N$.
When necessary to avoid confusion, we will write $f_*|_m$ for the restriction of $f_*$ to $T_mM$.

If charts $(U, \phi)$ and $(V, \psi)$ are chosen for M near m and for N near f(m)
respectively, the differential of f at m may be expressed as a matrix with respect to the
determined bases for $T_mM$ and $T_{f(m)}N$. Let $\{x_i\}$ and $\{y_j\}$ be the
local coordinates on M and N provided by $\phi$ and $\psi$. Then

$f_*\left(\frac{\partial}{\partial x_i}\Big|_m\right) = \sum_j \frac{\partial (y_j \circ f)}{\partial x_i}\Big|_m \frac{\partial}{\partial y_j}\Big|_{f(m)}$.

With respect to the bases $\{\partial/\partial x_i|_m\}$ and $\{\partial/\partial y_j|_{f(m)}\}$, it is represented by the
transpose of the matrix

$J_f(m) = \left[\frac{\partial (y_i \circ f)}{\partial x_j}\Big|_m\right]$.

$J_f(m)$ is called the Jacobian of f at m (compare with Equation 1.1).

The differentials of compositions of $C^\infty$ maps obey a chain rule. If $f: M \to N$
and $g: N \to L$ are $C^\infty$, then for $m \in M$, $(g \circ f)_*|_m = g_*|_{f(m)} \circ f_*|_m$. Similarly,
$J_{g \circ f}(m) = J_g(f(m)) \circ J_f(m)$.

By a curve in M, we shall mean a $C^\infty$ map $\alpha: I \to M$, where $I \subset R$ is an
interval. Notice that distinct curves $\alpha_1$ and $\alpha_2$ may have the same image in M, for example
$\alpha_1(t) = (t, t)$ and $\alpha_2(t) = (t^3, t^3)$, where $M = R^2$. The tangent to $\alpha$ at the point
$t = t_0$, denoted $\dot\alpha(t_0)$, is the vector $\alpha_*(d/dt|_{t_0})$ in $T_{\alpha(t_0)}M$, where d/dt is the usual
differentiation operator for functions of a real variable. We shall often consider
curves with a compact domain. The tangent to $\alpha$ at an endpoint is then understood
to be the tangent to (any) $C^\infty$ extension of $\alpha$ to a larger interval.

An integral curve of a vector field X is a curve $\alpha$ for which $\dot\alpha(t) = X_{\alpha(t)}$ for
all t. For every $m \in M$, a $C^\infty$ vector field X on M has an integral curve $\alpha$ defined
on some interval $(-\varepsilon, \varepsilon)$ with $\alpha(0) = m$. It is unique in the sense that if
$\beta: (-\delta, \delta) \to M$ is also an integral curve of X with $\beta(0) = m$, then $\alpha(t) = \beta(t)$ for
all $t \in (-\varepsilon, \varepsilon) \cap (-\delta, \delta)$. Expressed in local coordinates for M near m, the condition
$\dot\alpha(t) = X_{\alpha(t)}$ is a system of ordinary differential equations, and the results follow
from the fundamental existence and uniqueness theorem for solutions to such systems.
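
Since the integral curve condition is a system of ordinary differential equations in local coordinates, it can be explored with any numerical ODE scheme. The following is a minimal Python sketch and is not part of the original article. The field $X = -u_2\,\partial/\partial u_1 + u_1\,\partial/\partial u_2$ on $R^2$ is chosen here purely as an illustration, and the helper names rk4_step and integral_curve are hypothetical. A classical fourth-order Runge-Kutta step stands in for the existence theorem.

```python
import math

def X(p):
    # Illustrative vector field on R^2: X = -u2 d/du1 + u1 d/du2.
    u1, u2 = p
    return (-u2, u1)

def rk4_step(field, p, h):
    # One classical Runge-Kutta step for dp/dt = field(p).
    def shifted(scale, v):
        return tuple(pi + scale * vi for pi, vi in zip(p, v))
    k1 = field(p)
    k2 = field(shifted(h / 2, k1))
    k3 = field(shifted(h / 2, k2))
    k4 = field(shifted(h, k3))
    return tuple(
        pi + h / 6 * (a + 2 * b + 2 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )

def integral_curve(field, m, t_end, steps=1000):
    # Approximate alpha(t_end) for the integral curve with alpha(0) = m.
    p = m
    h = t_end / steps
    for _ in range(steps):
        p = rk4_step(field, p, h)
    return p

end = integral_curve(X, (1.0, 0.0), math.pi / 2)
```

For this field the integral curve through (1, 0) is the unit circle $t \mapsto (\cos t, \sin t)$, so the computed endpoint should be close to (0, 1).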

In the same way that homeomorphic spaces are equivalent topologically, two
differentiable manifolds are considered equivalent if they are diffeomorphic, that
is, homeomorphic via a $C^\infty$ map whose inverse is also $C^\infty$. They necessarily have
the same dimension.

Differentiable manifolds are the "spaces," the point-sets, for differential geometry.


2. Geometry as a linear connection. In this section we shall define a geometric
structure on M, motivated by the fact that in $R^n$ we may differentiate one vector
field with respect to another. In this formulation a "straight line" has the property
that the derivative of its tangent vector field with respect to itself is zero. The
references for this section are [5], p. 56, and [10], Vol. I, Ch. 6.

A linear connection (or covariant derivative) is an assignment of a vector field
$\nabla_X Y$ to each pair $X, Y \in \mathfrak{X}(M)$ such that for all $X, Y, Z \in \mathfrak{X}(M)$ and $f \in C^\infty(M)$,

(i) $\nabla_{X+Y} Z = \nabla_X Z + \nabla_Y Z$,

(ii) $\nabla_{fX} Y = f\nabla_X Y$,

(iii) $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$, and

(iv) $\nabla_X (fY) = f\nabla_X Y + (Xf)Y$.

Note that by conditions (i) and (ii), $\nabla_X Y$ at m depends only on X at m and not on
X in a neighborhood of m. Therefore $\nabla_{X_m} Y$ makes sense and is an element of $T_mM$.
The fourth condition makes sense because $fY \in \mathfrak{X}(M)$, whereas (Xf)Y is just the
product of the $C^\infty$ vector field Y and the $C^\infty$ function Xf. It is important to note
that there are many different linear connections on a given manifold, i.e., a space may
have different geometries on it. We shall see an example of this in the next section.

A curve $\alpha: I \to M$ is a geodesic if $\nabla_{\dot\alpha}\dot\alpha(t) = 0$. At first glance this definition
makes no sense because $\dot\alpha$ assigns a tangent vector only to the points $\alpha(t) \in M$,
not to every point in M. With a little work it can be shown that if X is any vector
field with $X_{\alpha(t)} = \dot\alpha(t)$, then $\nabla_X X(\alpha(t))$ is independent of the extension X. By
$\nabla_{\dot\alpha}\dot\alpha(t)$, we therefore understand $\nabla_X X(\alpha(t))$ for any such X.

There is a technical problem here in that if $\dot\alpha$ is zero at some point, it may not
be possible to find an extension X of $\dot\alpha$. On the other hand, we are only interested
in geodesics and it can be shown that, for a geodesic $\alpha$, if $\dot\alpha(t) = 0$ for some t, then $\alpha$
is a constant curve, and so we shall ignore this technical point.




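Example. The standard connection on $R^n$. If $Y \in \mathfrak{X}(R^n)$, we may write
$Y_p = \sum_i f_i(p)\, \partial/\partial u_i|_p$ for some $f_i \in C^\infty(R^n)$, since $\{\partial/\partial u_i|_p\}$ is a basis for $T_pR^n$ at
all $p \in R^n$. We define $\nabla_X Y(p) = \sum_i [Xf_i](p)\, \partial/\partial u_i|_p$. In particular,

$\nabla_{\partial/\partial u_i}\, \partial/\partial u_j = 0$.

If $\alpha(t) = (\alpha_1(t), \dots, \alpha_n(t))$, then it can be shown that

$\nabla_{\dot\alpha}\dot\alpha = \sum_i \frac{d^2\alpha_i}{dt^2} \frac{\partial}{\partial u_i}$

and so $\nabla_{\dot\alpha}\dot\alpha$ is a measure of the acceleration of $\alpha$. It is also immediate that the
geodesics are precisely the usual straight lines on $R^n$. For instance, if $\alpha: R \to R^2$ is
the curve $\alpha(t) = (\cos t, \sin t)$, then

$\dot\alpha(t) = (-\sin t)\frac{\partial}{\partial u_1} + (\cos t)\frac{\partial}{\partial u_2}$.

If $X_{(u_1, u_2)} = -u_2(\partial/\partial u_1) + u_1(\partial/\partial u_2)$, then $X_{\alpha(t)} = X_{(\cos t, \sin t)} = \dot\alpha(t)$. Hence

$\nabla_{\dot\alpha}\dot\alpha(t) = \nabla_X X(\alpha(t))$
$= [X(-u_2)]\frac{\partial}{\partial u_1} + [X(u_1)]\frac{\partial}{\partial u_2}$
$= -u_1\frac{\partial}{\partial u_1} - u_2\frac{\partial}{\partial u_2}$,

which is non-zero, as we know, because a circle is not a geodesic in $R^2$ (Fig. 2).

[Fig. 2]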
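
For the standard connection, the formula $\nabla_{\dot\alpha}\dot\alpha = \sum_i (d^2\alpha_i/dt^2)\,\partial/\partial u_i$ says that the covariant derivative is the ordinary componentwise acceleration. The following minimal Python sketch, which is not part of the original article, checks the circle computation above with a central second difference. The function names and the step size h are illustrative assumptions.

```python
import math

def alpha(t):
    # The circle from the example above.
    return (math.cos(t), math.sin(t))

def line(t):
    # A straight line, for comparison.
    return (2.0 * t - 1.0, 0.0)

def acceleration(curve, t, h=1e-4):
    # Componentwise central second difference, approximating d^2(curve)/dt^2.
    before, here, after = curve(t - h), curve(t), curve(t + h)
    return tuple((a - 2.0 * b + c) / (h * h)
                 for a, b, c in zip(before, here, after))

t0 = 0.7
circle_acc = acceleration(alpha, t0)
expected = tuple(-c for c in alpha(t0))   # -u1 d/du1 - u2 d/du2 at alpha(t0)
line_acc = acceleration(line, t0)
```

Up to discretization error, circle_acc agrees with $-\alpha(t_0)$, while line_acc is approximately zero, as expected for a geodesic of the standard connection.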


Geodesics starting at a given point $m \in M$ always exist, although they may not
be defined on a very large interval. In fact, the following theorem says that an initial
tangent may also be specified.


THEOREM 2.1. Let m be a point of M. For each $X_m \in T_mM$, there exists a
geodesic $\alpha: (-\varepsilon, \varepsilon) \to M$ (for some $\varepsilon > 0$) such that $\alpha(0) = m$ and $\dot\alpha(0) = X_m$.


This "existence and uniqueness" theorem is so similar, in spirit and in proof,
to Theorem 2.2 that we shall postpone a discussion of its proof.

A vector field along a curve $\alpha: I \to M$ is a function Y which assigns to each
$t \in I$ a vector $Y_{\alpha(t)}$ in $T_{\alpha(t)}M$ such that for all $f \in C^\infty(M)$, $t \mapsto Y_{\alpha(t)}f$ is a $C^\infty$
real-valued function on I. If Y is a vector field along $\alpha$, then Y is parallel along $\alpha$
provided $\nabla_{\dot\alpha(t)} Y = 0$ for all $t \in I$. For the standard connection on $R^n$, $Y = \sum_i f_i(\partial/\partial u_i)$
is parallel along $\alpha$ if and only if $f_i \circ \alpha$ is constant for each i. For example, the basis
vector fields $\{\partial/\partial u_i\}$ are parallel along any curve.

The next two theorems will be essential in the last section. The first is an
existence theorem.


THEOREM 2.2. Let $\alpha$ be a curve in M and $m = \alpha(0)$. For each $X_m$ in $T_mM$,
there is a unique vector field Y defined along $\alpha$ such that Y is parallel along $\alpha$
and $Y_m = X_m$.


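Proof (sketch). We do this locally. In a coordinate neighborhood U of $\alpha(0)$,
we may write $\alpha(t) = (\alpha_1(t), \dots, \alpha_n(t))$, i.e., $\alpha_i = x_i \circ \alpha$, and $\dot\alpha(t) = \sum_i (d\alpha_i/dt)(\partial/\partial x_i)$
by Equation 1.2. We may also write

$\nabla_{\partial/\partial x_i} \frac{\partial}{\partial x_j} = \sum_k \Gamma^k_{ij} \frac{\partial}{\partial x_k}$

for some $n^3$ functions $\Gamma^k_{ij} \in C^\infty(U)$. If $Y = \sum_j f_j(\partial/\partial x_j)$, then $\nabla_{\dot\alpha(t)} Y = 0$ if and
only if

$\sum_i \frac{d\alpha_i}{dt} \left[ \sum_j \frac{\partial f_j}{\partial x_i} \frac{\partial}{\partial x_j} + \sum_{j,k} f_j \Gamma^k_{ij} \frac{\partial}{\partial x_k} \right] = 0$

along $\alpha$, or (evaluating the left-hand side on $x_k$) for each k

$\sum_i \frac{d\alpha_i}{dt} \left[ \frac{\partial f_k}{\partial x_i} + \sum_j f_j \Gamma^k_{ij} \right] = 0$

or

(2.1)    $\frac{d(f_k \circ \alpha)}{dt} + \sum_{i,j} \frac{d\alpha_i}{dt} (f_j \circ \alpha)(\Gamma^k_{ij} \circ \alpha) = 0$

subject to the condition that the $f_k \circ \alpha(0)$ are given to be the components of $X_m$.
We now invoke the fundamental existence and uniqueness theorem for ordinary
differential equations where $\Gamma^k_{ij}$ and $d\alpha_i/dt$ are known and $f_k \circ \alpha$ are unknown.
This will give us $(f_k \circ \alpha)(t)$ for small t. Continuing in this way and using the
compactness of I, we obtain a well-defined vector field Y along $\alpha$.

The $\Gamma^k_{ij}$ introduced in this proof are the classical Christoffel symbols. In terms
of them, the condition that $\alpha$ be a geodesic is

$\frac{d^2\alpha_k}{dt^2} + \sum_{i,j} \Gamma^k_{ij}(\alpha(t)) \frac{d\alpha_i}{dt} \frac{d\alpha_j}{dt} = 0$.

This system has a unique solution (for small t) with given $\{\alpha_i(0)\}$ and $\{d\alpha_i/dt|_0\}$,
that is, with given $\alpha(0)$ and $\dot\alpha(0)$, providing a proof of Theorem 2.1.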
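
The geodesic system above can be integrated numerically once the Christoffel symbols are known. The following minimal Python sketch is not part of the original article. As an assumed test case it uses polar coordinates $(r, \theta)$ on the plane minus the origin, where the standard connection has the well-known non-zero symbols $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$; the function names are hypothetical. Since geodesics of the standard connection are straight lines, the solution should trace a line when converted back to Cartesian coordinates.

```python
import math

def christoffel(k, i, j, x):
    # Gamma^k_{ij} in polar coordinates x = (r, theta); index 0 is r, 1 is theta.
    r = x[0]
    if k == 0 and i == 1 and j == 1:
        return -r
    if k == 1 and {i, j} == {0, 1}:
        return 1.0 / r
    return 0.0

def geodesic_rhs(state):
    # state = (x_0, x_1, v_0, v_1) with v = dx/dt; returns d(state)/dt using
    # d^2 x_k/dt^2 = - sum_{i,j} Gamma^k_{ij} v_i v_j.
    n = 2
    x, v = state[:n], state[n:]
    acc = tuple(
        -sum(christoffel(k, i, j, x) * v[i] * v[j]
             for i in range(n) for j in range(n))
        for k in range(n)
    )
    return tuple(v) + acc

def rk4(rhs, state, t_end, steps=2000):
    # Classical Runge-Kutta integration from t = 0 to t = t_end.
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + h / 2 * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + h / 2 * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

# Start at r = 1, theta = 0 with dr/dt = 0, dtheta/dt = 1, i.e. at the
# Cartesian point (1, 0) with Cartesian velocity (0, 1).
r, theta, _, _ = rk4(geodesic_rhs, (1.0, 0.0, 0.0, 1.0), 1.0)
x, y = r * math.cos(theta), r * math.sin(theta)
```

With these initial data the geodesic is the vertical line through (1, 0), so at t = 1 the Cartesian point (x, y) should be close to (1, 1).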

Returning to the parallel vector fields along $\alpha$ and the notation of Theorem 2.2,
let $\tau_{\alpha(t)}: T_{\alpha(0)}M \to T_{\alpha(t)}M$ be defined by $\tau_{\alpha(t)}(X_m) = Y_{\alpha(t)}$. The map $\tau_\alpha$ is called parallel
translation along $\alpha$, and an examination of the proof of Theorem 2.2 yields:


THEOREM 2.3. $\tau_{\alpha(t)}$ is a (vector space) isomorphism of $T_{\alpha(0)}M$ onto $T_{\alpha(t)}M$.


In $R^n$ with the standard connection, $\tau_\alpha$ is independent of $\alpha$, essentially because
$\Gamma^k_{ij} = 0$ in Equation 2.1. In fact

$\tau_{\alpha(t)}\left(\sum_i a_i \frac{\partial}{\partial x_i}\Big|_{\alpha(0)}\right) = \sum_i a_i \frac{\partial}{\partial x_i}\Big|_{\alpha(t)}$.

This is not the case in most other geometries and this is what makes $R^n$ so special.
If in $R^n$ a vector is parallel translated around a closed curve, it comes back to itself
(i.e., let $\alpha$ be a curve in $R^n$ such that $\alpha(0) = \alpha(1) = m$; then $\tau_{\alpha(1)}: T_mR^n \to T_mR^n$
is the identity). This phenomenon occurs because $R^n$ has zero "curvature." In this paper
we are only concerned with connections and other structures which give connections
and so we shall omit curvature although it is an important concept in differential
geometry. Curvature is the theme of the second volume of Spivak [10] and we refer
the reader there.

It may be surprising that parallel translation is just a global version of covariant 
differentiation. We mean this in the following sense. 


THEOREM 2.4. Giving a linear connection on a manifold M is equivalent to
giving for each curve $\alpha$ and each t, a vector space isomorphism $\tau_{\alpha(t)}: T_{\alpha(0)}M \to T_{\alpha(t)}M$
such that for every $X_{\alpha(0)} \in T_{\alpha(0)}M$

(i) the assignment $t \mapsto \tau_{\alpha(t)}X_{\alpha(0)}$ is $C^\infty$ and

(ii) for all $A \in Gl(n, R)$,

$\tau_{\alpha(t)}(AX_{\alpha(0)}) = A(\tau_{\alpha(t)}X_{\alpha(0)})$.


Note. If $x_1, \dots, x_n$ form a coordinate system in a neighborhood of $\alpha(t_0)$, we
may write

$\tau_{\alpha(t)}X_{\alpha(0)} = \sum_i a_i(t) \frac{\partial}{\partial x_i}\Big|_{\alpha(t)}$

for t in a neighborhood of $t_0$, where the $a_i$'s are real-valued functions. Condition
(i) means that the $a_i$'s must be $C^\infty$ on this neighborhood.


Proof. We saw in Theorem 2.3 that every linear connection gives rise to parallel
translation. If we have parallel translation, then given $X_m$ and $Y \in \mathfrak{X}(M)$, we
define $(\nabla_X Y)(m)$ as follows. Let $\alpha$ be any integral curve of $X_m$ with $\alpha(0) = m$ and set

(2.2)    $(\nabla_X Y)(m) = \lim_{t \to 0} \frac{1}{t}\left(\tau_{\alpha(t)}^{-1} Y_{\alpha(t)} - Y_m\right)$,

where the limit is in $T_mM$, which is isomorphic to $R^n$. We refer to [10], Vol. II,
pp. 6-11 for details. ∎

Using parallel translation we may compare the tangent spaces at any two points
of M which may be joined by a curve $\alpha$. Explicitly, we may define
$\tau_{t_0}^{t_1} = \tau_{\alpha(t_1)} \circ \tau_{\alpha(t_0)}^{-1}: T_{\alpha(t_0)}M \to T_{\alpha(t_1)}M$ (which is of course an isomorphism) which
allows us to compare the tangent spaces. In general, this isomorphism will depend
on $\alpha$.

Formula 2.2 means that covariant differentiation really measures how much 
the given vector field deviates from being parallel. 

Because of Theorem 2.4, we can say that parallel translation is a geometric 
structure. We shall use this interpretation of geometric structure to motivate the 
definition of a connection on a fiber bundle in the last section. 


3. Geometry as a Riemannian metric. In this section we shall add to a manifold 
M a structure (Riemannian metric) which makes M into a metric space. This struc- 
ture will also lead us to a natural linear connection on M and we shall then see 
whether the set of geodesics coincides with the set of length-minimizing curves,
as it does on $R^n$.

This theory is appealing because the notion of distance is a familiar one. It will 
necessarily be more complicated than the same notion for Euclidean space, where 
things are measured by comparison with straight lines, and in fact the assignment 
of a length to a single line suffices. In his 1854 lecture, Riemann pointed out that 
there is no reason why the length of a line should be assumed to be independent 
of its position. He proposed measuring infinitesimals (for our purposes, tangent 
vectors, representing velocity) and integrating over a curve to find its length. The 
metric on the manifold is finally given in terms of lengths of curves. The increased 
generality of this method should be obvious; its simplicity lies in our use of calculus 
to investigate geometry. 


A Riemannian metric (or inner product) g for M is the assignment of a
positive-definite inner product $g_m$ to each tangent space $T_mM$, which is differentiable in
the sense that for any $C^\infty$ vector fields X and Y on M, the function $m \mapsto g_m(X_m, Y_m)$
from M to R is $C^\infty$. A Riemannian manifold is a differentiable manifold M provided
with a Riemannian metric. Every differentiable manifold may be endowed with a
Riemannian metric, for example, by using a $C^\infty$ partition of unity to piece together
arbitrary $C^\infty$ metrics on the coordinate neighborhoods ([5], p. 85).

The length of a vector $X_m$ in the inner product space $T_mM$ is given as usual
by $\|X_m\| = [g_m(X_m, X_m)]^{1/2}$.

Let $\alpha$ be a smooth curve in M with domain [a, b]. The length of $\alpha$ is defined
to be

$L_a^b(\alpha) = \int_a^b \|\dot\alpha(t)\|\, dt$.

The integral exists since $\|\dot\alpha(t)\|$ is a continuous function of t. If $\alpha$ is a piecewise
$C^\infty$ curve ($\alpha$ is continuous and its domain is the finite union of intervals in which
$\alpha$ is $C^\infty$) its length is defined to be the sum of the lengths of the $C^\infty$ pieces. If there
is no confusion, we shall write $L(\alpha)$ for $L_a^b(\alpha)$.

If M is a connected (therefore path-connected) Riemannian manifold, there is
at least one piecewise $C^\infty$ curve (in fact, at least one $C^\infty$ curve) joining any two
points of M. We define a distance function $d: M \times M \to R$ by

$d(m, q) = \inf \{L(\alpha) \mid \alpha$ is a piecewise $C^\infty$ curve in M with endpoints m and q$\}$.

Clearly $d(m, q) \ge 0$ and $d(m, m) = 0$. It is also true that $d(m, q) > 0$ unless m = q,
and that for any third point $r \in M$, $d(m, q) \le d(m, r) + d(r, q)$ (the triangle
inequality). The function d is thus a metric (in the usual sense) for M, determined
by g. Different g's will yield different metrics for M, but such metric topologies
for M are always the original manifold topology ([5], p. 70).


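Examples. 1. As usual, let $\{u_i\}$ be the coordinate functions of Euclidean space.
Define (the standard) Riemannian metric on $R^n$ by

$g_m\left(\frac{\partial}{\partial u_i}\Big|_m, \frac{\partial}{\partial u_j}\Big|_m\right) = \delta_{ij}$.

In particular, the vectors $\partial/\partial u_1|_m$ and $\partial/\partial u_2|_m$ form an orthonormal basis for $T_mR^2$.

Let $\alpha = (\alpha_1, \alpha_2)$ be a curve in $R^2$. By Equation 1.2 and the definition of $\dot\alpha$,

$\dot\alpha = \frac{d\alpha_1}{dt}\frac{\partial}{\partial u_1} + \frac{d\alpha_2}{dt}\frac{\partial}{\partial u_2}$

and

$\|\dot\alpha\| = \left[\left(\frac{d\alpha_1}{dt}\right)^2 + \left(\frac{d\alpha_2}{dt}\right)^2\right]^{1/2}$.

Let us compute the lengths of the straight and semicircular segments which join
(-1, 0) to (1, 0). In the first case, $\alpha(t) = (2t - 1, 0)$ for $t \in [0, 1]$. Since $\dot\alpha = 2(\partial/\partial u_1)$,
we have $\|\dot\alpha(t)\| = 2$ for all t, and $L(\alpha) = 2$. In the second case, $\alpha(t) = (-\cos \pi t, \sin \pi t)$
for $t \in [0, 1]$. Then

$\dot\alpha = \pi \sin \pi t\, \frac{\partial}{\partial u_1} + \pi \cos \pi t\, \frac{\partial}{\partial u_2}$,

so that $\|\dot\alpha(t)\| = \pi$ for all t and $L(\alpha) = \pi$. This inner product gives the usual metric
structure for $R^2$ and may be restricted to any open subset of $R^2$ to give its usual
metric structure.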
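2. Let H be the open upper half-plane in $R^2$ with its usual subspace manifold
structure. Define g by

$g_{(u_1, u_2)}\left(\frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j}\right) = \frac{\delta_{ij}}{u_2^2}$.

For $\varepsilon > 0$, let us compute the length of the curve $\alpha_\varepsilon: [\varepsilon, 1] \to H$ given by
$\alpha_\varepsilon(t) = (0, t)$. Since $\dot\alpha_\varepsilon = \partial/\partial u_2$, we have $\|\dot\alpha_\varepsilon(t)\| = 1/t$. Therefore

$L(\alpha_\varepsilon) = \ln t\,\Big|_\varepsilon^1 = -\ln \varepsilon$.

As $\varepsilon$ approaches zero, $L(\alpha_\varepsilon)$ obviously becomes infinite.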
3. Let $h: S^2 \to R^3$ be the inclusion map and let $g^3$ denote the standard metric
on $R^3$. We define the induced metric g on $S^2$ by

$g(X, Y) = g^3(h_*X, h_*Y)$.

In order to compute its length, a curve in $S^2$ is therefore viewed as a curve in $R^3$.
Consider for example the curve $\beta(t) = (0, \sin \pi t, \cos \pi t)$ for $t \in [0, 1]$, which follows
a great circle between the North and South poles. We have

$h_*(\dot\beta) = \pi \cos(\pi t)\frac{\partial}{\partial u_2} - \pi \sin(\pi t)\frac{\partial}{\partial u_3}$.

Therefore, $\|\dot\beta(t)\| = \|h_*[\dot\beta(t)]\| = \pi$ and $L(\beta) = \pi$. This is actually the minimum
length for curves in $S^2$ joining (0, 0, 1) and (0, 0, -1).
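
The lengths computed in these three examples can be reproduced numerically from the definition of $L(\alpha)$. The following minimal Python sketch is not part of the original article; the helper names, the finite-difference velocity, and the midpoint rule are illustrative assumptions. A norm is passed in as a function of the base point and the velocity components, so the same routine handles the Euclidean and half-plane metrics.

```python
import math

def velocity(curve, t, h=1e-6):
    # Central-difference approximation of the tangent components.
    after, before = curve(t + h), curve(t - h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(after, before))

def length(curve, norm, a, b, steps=20000):
    # Midpoint rule for the integral of ||alpha'(t)|| over [a, b].
    dt = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * dt
        total += norm(curve(t), velocity(curve, t)) * dt
    return total

def euclidean_norm(point, v):
    return math.sqrt(sum(c * c for c in v))

def half_plane_norm(point, v):
    # Norm for the metric delta_ij / u2^2 on the upper half-plane.
    return math.sqrt(sum(c * c for c in v)) / point[1]

def segment(t):
    return (2.0 * t - 1.0, 0.0)

def semicircle(t):
    return (-math.cos(math.pi * t), math.sin(math.pi * t))

def vertical(t):
    return (0.0, t)

def meridian(t):
    # The great-circle arc of Example 3, viewed as a curve in R^3.
    return (0.0, math.sin(math.pi * t), math.cos(math.pi * t))

eps = 0.01
L_segment = length(segment, euclidean_norm, 0.0, 1.0)
L_semicircle = length(semicircle, euclidean_norm, 0.0, 1.0)
L_vertical = length(vertical, half_plane_norm, eps, 1.0)
L_meridian = length(meridian, euclidean_norm, 0.0, 1.0)
```

Up to discretization error, the four values should agree with 2, $\pi$, $-\ln \varepsilon$, and $\pi$ respectively.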

We shall return to these examples later. In the meantime, we would like to show
that curves such as $\beta$, for which

$L_t^s(\beta) = d(\beta(t), \beta(s))$

are geodesics as defined in Section 2. We need to obtain a linear connection $\nabla$
from the Riemannian metric, hopefully with some geometrically meaningful
properties. Being motivated by considerations in $R^n$, we are now torn between defining
a geodesic as a "self-parallel" curve (i.e., $\nabla_{\dot\alpha}\dot\alpha = 0$) and as a curve which minimizes
distance. Since the self-parallel condition is a local one, using this definition of a
geodesic eliminates some problems which otherwise arise. Let us illustrate one
problem. If the domain of $\beta$ (as defined in Example 3 above) is extended to [0, 2],
the image of $\beta$ is a great circle on $S^2$ and the total length of $\beta$ is $2\pi$. Obviously, $2\pi$
is not the distance from $\beta(0)$ to $\beta(2)$ since $\beta(0) = \beta(2)$. But the equation above is still
valid for small (s - t). The (global) length-minimizing property of $\beta$ is lost when
its domain is extended. To solve this problem, we shall find a connection so that a
curve with a self-parallel tangent is locally length-minimizing.


NOTATION. For X and Y in $\mathfrak{X}(M)$, the $C^\infty$ vector field [X, Y], called the Lie
bracket of X and Y, is defined by

$[X, Y]_m f = X_m(Yf) - Y_m(Xf)$

for all $f \in C^\infty(M)$. This makes sense because Yf and Xf are $C^\infty$ functions.
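
The defining formula for the bracket can be tested directly on $R^2$ by letting vector fields act on functions through finite differences. The following minimal Python sketch is not part of the original article; the fields X and Y, the test function f, and the helper names are all illustrative assumptions. For $X = \partial/\partial u_1$ and $Y = u_1\,\partial/\partial u_2$ a short hand computation gives $[X, Y] = \partial/\partial u_2$.

```python
import math

def apply_field(field, f, h=1e-4):
    # Return the function Xf, where (Xf)(p) = sum_i X^i(p) * (df/du_i)(p),
    # with partial derivatives approximated by central differences.
    def Xf(p):
        total = 0.0
        for i, component in enumerate(field(p)):
            plus, minus = list(p), list(p)
            plus[i] += h
            minus[i] -= h
            total += component * (f(tuple(plus)) - f(tuple(minus))) / (2.0 * h)
        return total
    return Xf

def lie_bracket_on(X, Y, f):
    # The function p -> X_p(Yf) - Y_p(Xf).
    XYf = apply_field(X, apply_field(Y, f))
    YXf = apply_field(Y, apply_field(X, f))
    return lambda p: XYf(p) - YXf(p)

def X(p):
    return (1.0, 0.0)      # d/du1

def Y(p):
    return (0.0, p[0])     # u1 d/du2

def f(p):
    return p[0] ** 2 * p[1] + math.sin(p[1])

p0 = (0.5, 1.2)
bracket_value = lie_bracket_on(X, Y, f)(p0)
expected = p0[0] ** 2 + math.cos(p0[1])   # (df/du2)(p0)
```

Here bracket_value approximates $[X, Y]_{p_0} f$, which should agree with $\partial f/\partial u_2$ at $p_0$ up to finite-difference error.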

To define $\nabla$, we shall specify the inner product of $\nabla_{X_m} Y$ and $Z_m$ for all vector
fields X, Y, and Z at all points $m \in M$. Define the Levi-Civita or Riemannian
connection $\nabla$ by

(3.1)    $2g_m(\nabla_{X_m} Y, Z_m) = X_m g(Y, Z) + Y_m g(X, Z) - Z_m g(X, Y)$
         $+ g_m([X, Y]_m, Z_m) + g_m([Z, X]_m, Y_m) + g_m([Z, Y]_m, X_m)$

for all X, Y, and Z in $\mathfrak{X}(M)$. (Recall that g(Y, Z), etc., are functions on M.) In a
neighborhood U with coordinate functions $x_1, \dots, x_n$, let $X_i$ denote the vector
field $\partial/\partial x_i$. Then $[X_i, X_j] = 0$ because "mixed partial derivatives are equal" and
(3.1) simplifies to

$2g(\nabla_{X_i} X_j, X_k) = X_i g(X_j, X_k) + X_j g(X_i, X_k) - X_k g(X_i, X_j)$

in U. A computation shows that $\nabla$ satisfies the conditions for a linear connection.
This choice of $\nabla$ is partially justified by the following theorem, known as the
Fundamental Lemma of Riemannian Geometry (for a proof, see [5], p. 71).
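
The simplified coordinate form of (3.1) can be solved for the Christoffel symbols of Section 2. Writing $\nabla_{X_i} X_j = \sum_l \Gamma^l_{ij} X_l$ and $g_{ij} = g(X_i, X_j)$, it reads $2\sum_l \Gamma^l_{ij} g_{lk} = \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij}$, and multiplying by the inverse matrix of $[g_{ij}]$ isolates $\Gamma^k_{ij}$. The following minimal Python sketch is not part of the original article; it assumes a two-dimensional chart, approximates the derivatives by central differences, and uses hypothetical helper names. It is applied to the half-plane metric of Example 2.

```python
def half_plane_metric(p):
    # Matrix [g_ij] at p = (u1, u2) with u2 > 0, for g_ij = delta_ij / u2^2.
    s = 1.0 / (p[1] ** 2)
    return [[s, 0.0], [0.0, s]]

def metric_derivative(metric, p, k, h=1e-5):
    # Central-difference approximation of the matrix [d g_ij / d x_k] at p.
    plus, minus = list(p), list(p)
    plus[k] += h
    minus[k] -= h
    gp, gm = metric(plus), metric(minus)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, p):
    # gamma[k][i][j] approximates Gamma^k_{ij} at p.
    n = 2
    dg = [metric_derivative(metric, p, k) for k in range(n)]  # dg[k][i][j]
    ginv = inverse_2x2(metric(p))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                gamma[k][i][j] = 0.5 * sum(
                    ginv[k][l] * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j])
                    for l in range(n)
                )
    return gamma

gamma = christoffel(half_plane_metric, (0.0, 2.0))
```

For this metric a hand computation gives $\Gamma^1_{12} = \Gamma^1_{21} = -1/u_2$, $\Gamma^2_{11} = 1/u_2$, and $\Gamma^2_{22} = -1/u_2$, with all other symbols zero, so at $u_2 = 2$ the non-zero entries should be close to -0.5, 0.5, and -0.5.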


THEOREM 3.1. The connection $\nabla$ defined above is the unique connection on M
satisfying

(3.2)    $Xg(Y, Z) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z)$, and

(3.3)    the torsion $T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y]$ is zero

for all X, Y, and Z in $\mathfrak{X}(M)$.


A linear connection satisfying (3.2) is called a metric connection. The equation
expresses the directional derivative of the metric in terms of covariant derivatives,
but it has more significance. Theorem 2.3 showed that the parallel translation
operator is a critical part of the geometry of a connection. The operator most compatible
with a Riemannian metric is an isometry [i.e., a map $\tau: T_mM \to T_qM$ such that
$g_q(\tau X_m, \tau Y_m) = g_m(X_m, Y_m)$] because a Riemannian metric is nothing more than a
point-wise inner product. Therefore a "natural" linear connection on a Riemannian
manifold should have the property that parallel translation is an isometry. The
following proposition tells us that this is indeed the case for a metric connection.


PROPOSITION 3.2. Parallel translation is an isometry if and only if (3.2) holds 
for all vector fields X, Y, and Z. 


Proof. Assume (3.2) holds and let $\alpha$ be any curve in M. For Y and Z in $T_{\alpha(0)}M$,
let $Y_t$ and $Z_t$ denote their parallel translates, $\tau_{\alpha(t)}Y$ and $\tau_{\alpha(t)}Z$ respectively, to $\alpha(t)$.
Since Y and Z are parallel along $\alpha$, if $\nabla$ satisfies (3.2), then $\dot\alpha(t)\, g(Y, Z) = 0$. Hence

$0 = \dot\alpha(t)\, g(Y, Z) = \frac{d}{dt}\, g_{\alpha(t)}(Y_t, Z_t)$

and so $g_{\alpha(t)}(Y_t, Z_t)$ is constant, in other words

$g_{\alpha(t)}(Y_t, Z_t) = g_{\alpha(0)}(Y, Z)$,

which is precisely the statement that $\tau_{\alpha(t)}$ is an isometry.
Assuming that $\tau_{\alpha(t)}$ is an isometry for any curve $\alpha$, let $X_m$ be in $T_mM$. To verify
(3.2) at m, let $\alpha$ be any curve with $\alpha(0) = m$ and $\dot\alpha(0) = X_m$. Both sides of (3.2)
depend on Y and Z only along $\alpha$. First consider vector fields Y and Z which are
parallel along $\alpha$. Then at m the left-hand side of (3.2) is zero since g(Y, Z) is constant
along $\alpha$. The right-hand side is zero since $\nabla_{X_m} Y = \nabla_{X_m} Z = 0$. We have therefore
verified (3.2) for this case. Now let $\{X_i\}$ be an orthonormal basis for $T_mM$ and
let $\{X_i(t)\}$ denote its parallel translation to $\alpha(t)$. Since parallel translation is an
isometry, $\{X_i(t)\}$ is an orthonormal basis for $T_{\alpha(t)}M$. Arbitrary differentiable vector
fields Y and Z along $\alpha$ can be expressed $Y_{\alpha(t)} = \sum_i a_i(t)X_i(t)$ and $Z_{\alpha(t)} = \sum_i b_i(t)X_i(t)$,
where the a's and b's are differentiable. Then

$X_m g(Y, Z) = \frac{d}{dt}\Big|_0 g_{\alpha(t)}(Y_{\alpha(t)}, Z_{\alpha(t)}) = \frac{d}{dt}\Big|_0 \sum_i a_i(t)b_i(t)$

and

$g_m(\nabla_{X_m} Y, Z_m) + g_m(Y_m, \nabla_{X_m} Z)$
$= g_m\left(\sum_i \left[a_i(0)\nabla_{X_m} X_i + \frac{da_i}{dt}\Big|_0 X_i\right], \sum_j b_j(0)X_j\right) + g_m\left(\sum_i a_i(0)X_i, \sum_j \left[b_j(0)\nabla_{X_m} X_j + \frac{db_j}{dt}\Big|_0 X_j\right]\right)$
$= g_m\left(\sum_i \frac{da_i}{dt}\Big|_0 X_i, \sum_j b_j(0)X_j\right) + g_m\left(\sum_i a_i(0)X_i, \sum_j \frac{db_j}{dt}\Big|_0 X_j\right)$

because $\nabla_{X_m} X_i = 0$. Since $\{X_i\}$ is orthonormal, this is equal to

$\sum_i \left[\frac{da_i}{dt}(0)\, b_i(0) + a_i(0)\, \frac{db_i}{dt}(0)\right]$,

which is $X_m g(Y, Z)$. ∎
A connection satisfying (3.3) is, locally, as close as possible to the standard
Euclidean connection by the following proposition (see [10], Vol. II, p. 5-18 for a proof).


PROPOSITION 3.3. The torsion of a connection $\nabla$ is zero if and only if for every
$m \in M$, there is a coordinate system $\{x_i\}$ for M near m so that

$\nabla_{\partial/\partial x_i} \frac{\partial}{\partial x_j} = 0$ at m,

or in classical notation, $\Gamma^k_{ij}(m) = 0$.


A connection satisfying (3.3) is said to be torsion-free. Geodesics are determined
by a connection, as in Section 2, and metric connections are determined by their
torsion. With our interest in geodesics, we note that two different metric connections
may have the same geodesics (see [1], page 131, or [7]). By a geodesic of a Riemannian
manifold we shall mean a geodesic of the Levi-Civita connection. Notice that
differentiating $g(\dot\alpha(t), \dot\alpha(t))$ with respect to t shows that the tangent vectors to a
geodesic $\alpha$ have constant length.

We come finally to the role of lengths of curves. 


THEOREM 3.4. For every point m in a Riemannian manifold M, there is a
positive number $l$ and a neighborhood $U = \{q \in M \mid d(m, q) < l\}$ such that

(i) any points q and r in U may be joined by a unique geodesic $\alpha$ whose image
lies in U (unique up to a linear reparametrization),

(ii) the geodesic $\alpha$ is the unique curve joining q to r which has length $d(q, r)$,
and

(iii) for each $q \in U$, there is a local coordinate system about q in which all
geodesics $\alpha$ in U with $\alpha(0) = q$ have the form $\alpha(t) = (a_1 t, \ldots, a_n t)$ with $a_1, \ldots, a_n$
constant.


A proof may be found in [6], Vol. I, p. 166. This theorem says in particular 
that geodesics connecting nearby points have minimal length among all curves 
with the same endpoints. Furthermore, any point may be joined to any nearby 
point by a unique, length-minimizing geodesic. Locally the situation is similar 
to $R^n$; globally, however, this is not the case.

Given m and q, arbitrary points in a Riemannian manifold, there is not necessarily
a curve of length d(m, q) which joins m and q. For example, in $R^2 - \{(0, 0)\}$
with the usual Riemannian metric as an open subset of $R^2$, every curve from
$m = (-1, 0)$ to $q = (1, 0)$ has length at least 2 and for any $\varepsilon > 0$, there is a curve
joining m and q with length less than $2 + \varepsilon$, hence d(m, q) = 2. However, there is
no such curve of length 2. This phenomenon is due to the fact that $R^2 - \{(0, 0)\}$
is not a complete metric space, as we shall see.

A manifold M with a linear connection $\nabla$ is called geodesically complete if
every geodesic $\alpha: [a, b] \to M$ determined by $\nabla$ can be extended to a geodesic with
domain R.


THEOREM 3.5. (Hopf-Rinow). A Riemannian manifold (with the Levi-Civita
connection) is geodesically complete if and only if it is complete as a metric space,
in which case any two points m and q in M may be joined by a curve of length
d(m, q).

A proof may be found in [10], Vol. I, p. 9-55. The curve in Theorem 3.5 is
actually a geodesic, as the next theorem shows.


THEOREM 3.6. If $\alpha$ is a curve whose length realizes the distance between its
endpoints, then $\alpha$ is a geodesic.


Proof. Such a curve $\alpha$ must realize the distance between any two points in its
image, or else a shorter (perhaps only piecewise $C^\infty$) curve with the same endpoints
could be found by replacing a section of $\alpha$. Theorem 3.4 now guarantees that $\alpha$ is
a geodesic in a neighborhood of every point in its image, therefore $\alpha$ is a geodesic. ∎

The Christoffel symbols for the Levi-Civita connection are given in terms of
the metric by $\Gamma^k_{ij} = g(\nabla_{X_i} X_j, X_k)$, where $\{X_i = \partial/\partial x_i\}$ is an orthonormal basis for
$T_m M$. For actual computations, it is convenient to write the metric as a matrix-valued
function on coordinate neighborhoods $[g_{ij}] = [g(X_i, X_j)]$ with inverse
$g^{-1} = [g^{ij}]$. The Christoffel symbols may then be computed:

\[
\Gamma^k_{ij} = \frac{1}{2} \sum_h g^{hk} \left( \frac{\partial g_{ih}}{\partial x_j} + \frac{\partial g_{jh}}{\partial x_i} - \frac{\partial g_{ij}}{\partial x_h} \right) \tag{3.4}
\]
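As a hedged numerical illustration of Equation 3.4, the following Python sketch approximates the Christoffel symbols of a metric by central differences and evaluates them for the upper half-plane metric $g_{ij} = \delta_{ij}/u_2^2$. The function names (`invert`, `christoffel`, `half_plane_metric`), the step size, and the sample point are our own assumptions and do not come from the text. Indices are 0-based, so `gamma[k][i][j]` stands for the symbol with upper index k+1 and lower indices i+1, j+1.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination (partial pivoting)."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def christoffel(metric, point, step=1e-5):
    """Approximate gamma[k][i][j] ~ Gamma^k_ij at `point` from Equation (3.4)."""
    n = len(point)

    def partial(i, j, l):
        # Central-difference estimate of d g_ij / d x_l at `point`.
        up, down = list(point), list(point)
        up[l] += step
        down[l] -= step
        return (metric(up)[i][j] - metric(down)[i][j]) / (2 * step)

    g_inv = invert(metric(point))
    return [[[0.5 * sum(g_inv[h][k] * (partial(i, h, j) + partial(j, h, i) - partial(i, j, h))
                        for h in range(n))
              for j in range(n)]
             for i in range(n)]
            for k in range(n)]


def half_plane_metric(u):
    """The metric g_ij = delta_ij / u_2^2 on the upper half-plane (u[1] > 0)."""
    return [[1.0 / u[1] ** 2, 0.0], [0.0, 1.0 / u[1] ** 2]]


gamma = christoffel(half_plane_metric, [0.3, 2.0])
print(gamma[0][0][1], gamma[1][0][0], gamma[1][1][1])  # close to -1/2, 1/2, -1/2
```

At the sample point $u_2 = 2$ the printed values approximate $-1/u_2$, $1/u_2$ and $-1/u_2$; a symbolic computation would give these exactly, but the finite-difference version needs only the standard library.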
Let us revisit the examples of this section. All complete Riemannian manifolds, 
they represent the classical Euclidean, hyperbolic, and spherical geometries. 


1. In $R^n$ with its standard metric, differentiating the constant function
$g(\partial/\partial u_i, \partial/\partial u_j)$ with respect to $\partial/\partial u_k$ shows that $\nabla_{\partial/\partial u_i}\, \partial/\partial u_j = 0$ for all i and j. The
Levi-Civita connection of the standard metric is therefore the standard connection.
The fact that $\Gamma^k_{ij} = 0$ for all i, j, and k was known to Christoffel. In this geometry,
two distinct points lie on a unique line, and given a line and a point not on it, there
is a unique line through the point which does not intersect the given line. By “line”
we mean the points in the image of a geodesic defined on R.


2. The Poincaré upper half-plane. Let $H \subset R^2$ be the upper half-plane with
the metric given as before by $g_{ij} = \delta_{ij}/u_2^2$. Then $g^{11} = g^{22} = u_2^2$ and $g^{12} = g^{21} = 0$.
We can easily compute the Christoffel symbols for H using Equation 3.4: the only
non-zero ones are

\[
\Gamma^1_{12} = \Gamma^1_{21} = -\Gamma^2_{11} = \Gamma^2_{22} = -\frac{1}{u_2}.
\]

The equations for a geodesic $\alpha = (\alpha_1, \alpha_2)$, where $\alpha_2 > 0$, now become

\[
\frac{d^2\alpha_1}{dt^2} - \frac{2}{\alpha_2(t)} \frac{d\alpha_1}{dt} \frac{d\alpha_2}{dt} = 0,
\]

\[
\frac{d^2\alpha_2}{dt^2} + \frac{1}{\alpha_2(t)} \left(\frac{d\alpha_1}{dt}\right)^2 - \frac{1}{\alpha_2(t)} \left(\frac{d\alpha_2}{dt}\right)^2 = 0.
\]

The only solutions to these equations are

\[
\alpha_1(t) = a + b \tanh(rt + c), \quad \alpha_2(t) = b\, \operatorname{sech}(rt + c)
\]

and

\[
\alpha_1(t) = a', \quad \alpha_2(t) = b' e^{r't},
\]

where the a’s, b’s, c’s, and r’s are constant. This manifold is also called the hyperbolic
upper half-plane. A “line” (point-set image $\alpha(R)$ of a geodesic $\alpha$) is either
a semi-circle centered on the horizontal axis or a vertical straight line (Fig. 3), and
two points determine a unique line. Notice that if “parallel” means non-intersecting,
in this geometry there are an infinite number of lines through a given point (a, b)
and parallel to a given line not containing (a, b). Thus Euclid’s Fifth Postulate is
not satisfied.
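The claim that semi-circles and vertical lines solve the geodesic equations of the upper half-plane can be checked numerically. The sketch below, in Python, evaluates the left-hand sides of the two equations by central differences along one curve of each family; the particular constants, the sample times, and the helper names `semicircle`, `vertical`, `residuals` and `speed` are illustrative assumptions of ours.

```python
import math


def semicircle(t, a=1.0, b=2.0, r=0.7, c=0.1):
    """alpha(t) = (a + b tanh(rt + c), b sech(rt + c))."""
    s = r * t + c
    return (a + b * math.tanh(s), b / math.cosh(s))


def vertical(t, a=1.0, b=2.0, r=0.7):
    """alpha(t) = (a', b' exp(r' t))."""
    return (a, b * math.exp(r * t))


def residuals(curve, t, h=1e-4):
    """Left-hand sides of the two geodesic equations, by central differences."""
    (xm, ym), (x0, y0), (xp, yp) = curve(t - h), curve(t), curve(t + h)
    dx, dy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    ddx, ddy = (xp - 2 * x0 + xm) / h ** 2, (yp - 2 * y0 + ym) / h ** 2
    return (ddx - 2.0 / y0 * dx * dy, ddy + (dx * dx - dy * dy) / y0)


def speed(curve, t, h=1e-4):
    """Length of the tangent vector in the metric g_ij = delta_ij / u_2^2."""
    (xm, ym), (_, y0), (xp, yp) = curve(t - h), curve(t), curve(t + h)
    return math.hypot((xp - xm) / (2 * h), (yp - ym) / (2 * h)) / y0


worst = max(abs(value)
            for curve in (semicircle, vertical)
            for t in (-1.0, 0.0, 0.5, 2.0)
            for value in residuals(curve, t))
print(worst)  # small, limited only by the finite-difference error
```

The function `speed` also lets one observe that the tangent vectors have constant length (here equal to the chosen constant r), in agreement with the remark made for geodesics in general.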


[FIG. 3: lines in the upper half-plane, with the points (a, b) and (a’, b’) marked]


3. The Riemann sphere. Let $S^2$ possess the metric induced by $R^3$. It can be
shown that a tangent vector to $S^2$ at m is precisely a tangent vector to $R^3$ at m which
is orthogonal to m (as a vector in $R^3$), and therefore a curve $\alpha$ in $S^2$ is a geodesic
in $S^2$ only if the Euclidean derivative $\nabla_{\dot\alpha} \dot\alpha$ (as a vector field along $\alpha$ in $R^3$) is everywhere
perpendicular to $S^2$. It can be shown that geodesics in $S^2$ are precisely the constant-speed
parametrizations of great circles. This manifold is also called spherical space.
Notice that antipodal points lie on infinitely many distinct “lines” (with the same
interpretation as in Example 2), while any other pair of points lie on a unique line.
Since two distinct lines intersect in exactly two points, there are no parallel lines,
and again Euclid’s Fifth Postulate does not hold.
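The perpendicularity criterion of Example 3 can be illustrated with a short Python sketch. It estimates the Euclidean second derivative of a curve on the unit sphere and measures the part tangent to the sphere, which should vanish for a constant-speed great circle but not for a circle of latitude. The two sample curves and the function names are assumptions chosen for illustration.

```python
import math


def great_circle(t, speed=1.5):
    """alpha(t) = cos(speed t) p + sin(speed t) q for orthonormal p, q in R^3."""
    p, q = (1.0, 0.0, 0.0), (0.0, 0.6, 0.8)
    return tuple(math.cos(speed * t) * pc + math.sin(speed * t) * qc
                 for pc, qc in zip(p, q))


def latitude_circle(t, height=0.5):
    """A circle of latitude on the unit sphere; not a great circle when height != 0."""
    radius = math.sqrt(1.0 - height ** 2)
    return (radius * math.cos(t), radius * math.sin(t), height)


def tangential_acceleration(curve, t, h=1e-4):
    """Norm of the part of the Euclidean second derivative that is tangent to S^2."""
    before, here, after = curve(t - h), curve(t), curve(t + h)
    acc = [(a - 2 * m + b) / h ** 2 for b, m, a in zip(before, here, after)]
    # On the unit sphere the normal direction at a point is the point itself.
    radial = sum(x * y for x, y in zip(acc, here))
    return math.sqrt(sum((x - radial * y) ** 2 for x, y in zip(acc, here)))


print(tangential_acceleration(great_circle, 0.4))     # essentially zero
print(tangential_acceleration(latitude_circle, 0.4))  # clearly non-zero
```

Only the great circle has acceleration everywhere normal to the sphere, which is consistent with the statement that the geodesics of $S^2$ are the constant-speed parametrizations of great circles.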


4. Geometry as a connection of a fibre bundle. We saw in Section 2 that the geometric
concept of covariant derivative leads naturally to the equally geometric concept
of parallel translation of tangent vectors along a curve, i.e., for each curve $\alpha$ in M,
an isomorphism $\tau_{\alpha(t)}: T_{\alpha(0)}M \to T_{\alpha(t)}M$. We shall now interpret $\tau_\alpha$ as a map from a
fiber bundle to itself which satisfies certain conditions.

Let M be an n-manifold. A frame on M is a point m of M (called the origin
of the frame) together with a basis for $T_m M$. Let L(M) be the collection of all frames
on M:

$L(M) = \{(m, X_1, \ldots, X_n) \mid m \in M \text{ and } \{X_i\} \text{ is a basis for } T_m M\}.$

L(M) is called the frame bundle of M. Let $\pi: L(M) \to M$ be given by
$\pi[(m, X_1, \ldots, X_n)] = m$. We shall make L(M) into a manifold as follows: Let $(U_\gamma, \phi_\gamma)$
be a chart for M and let $V_\gamma = \pi^{-1}(U_\gamma)$, i.e.,


$V_\gamma = \{(m, X_1, \ldots, X_n) \in L(M) \mid m \in U_\gamma\}.$


For any point m in $U_\gamma$,

\[
\left\{ \frac{\partial}{\partial x_1}\Big|_m, \ldots, \frac{\partial}{\partial x_n}\Big|_m \right\} \text{ is a basis for } T_m M
\]

(where we have suppressed the subscript $\gamma$ by writing $x_i$ for $u_i \circ \phi_\gamma$). Since any two
bases for a given vector space (in our case, $T_m M$) differ by a non-singular matrix,
given any frame $(m, X_1, \ldots, X_n)$, there is a matrix $A = [a_{ij}]$ in Gl(n, R) such that
A carries $\{\partial/\partial x_i|_m\}$ to $\{X_i\}$, i.e.,

\[
X_i = \sum_j a_{ji} \frac{\partial}{\partial x_j}\Big|_m .
\]

Writing G for Gl(n, R), we have maps $\psi_\gamma: V_\gamma \to U_\gamma \times G$ defined by
$\psi_\gamma[(m, X_1, \ldots, X_n)] = (m, A^t)$. We may put a topology on L(M) which makes each
$\psi_\gamma$ continuous by choosing as a base $\{\psi_\gamma^{-1}(W) \mid W \text{ is open in } U_\gamma \times G\}$. We now
define a chart for L(M) to be a pair $(V_\gamma, \tilde\phi_\gamma)$, where $\tilde\phi_\gamma: V_\gamma \to R^{n+n^2}$ is given by

\[
\tilde\phi_\gamma[(m, X_1, \ldots, X_n)] = (x_1(m), \ldots, x_n(m), a_{11}, a_{12}, \ldots, a_{nn}).
\]


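It is easy to see that if $V_\gamma \cap V_\delta \neq \emptyset$,

\[
\begin{aligned}
\tilde\phi_\delta \circ \tilde\phi_\gamma^{-1}&(x_1, \ldots, x_n, a_{11}, a_{12}, \ldots, a_{nn}) \\
&= \big(\phi_\delta \circ \phi_\gamma^{-1}(x_1, \ldots, x_n),\ J^t_{\phi_\delta \circ \phi_\gamma^{-1}}(a_{11}, \ldots, a_{1n}),\ \ldots,\ J^t_{\phi_\delta \circ \phi_\gamma^{-1}}(a_{n1}, \ldots, a_{nn})\big)
\end{aligned}
\]

and so the compatibility condition is satisfied. Here $J^t_{\phi_\delta \circ \phi_\gamma^{-1}}$ is the transpose of
the n × n Jacobian matrix $J_{\phi_\delta \circ \phi_\gamma^{-1}}$. With the given topology and charts, L(M)
is a differentiable manifold of dimension $n + n^2$. L(M) is sometimes called the
bundle of bases of M.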

We also have an “action” of G on L(M), specifically the map $\Phi: L(M) \times G \to L(M)$
given by

\[
\Phi((m, X_1, \ldots, X_n), A) = \Big(m, \sum_j a_{j1} X_j, \ldots, \sum_j a_{jn} X_j\Big),
\]

where $A = [a_{ij}]$ as usual. We will write $b^A$ for $\Phi(b, A)$ if $b \in L(M)$. Observe that
for all $b \in L(M)$, $(b^A)^B = b^{AB}$ for all A and B in G, and $b^A = b$ if and only if A
is the identity (matrix) of G.
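The two observations about the action can be tried out on a concrete frame. In the Python sketch below a frame is stored as a pair (point, vectors), and `act` implements the rule that the i-th vector of the new frame is the sum over j of a_ji X_j; the sample frame, the matrices, and the names `act` and `matmul` are our own assumptions. Integer entries are used so that the comparisons are exact.

```python
def act(frame, a):
    """Return the frame b^A: the i-th new vector is sum_j a[j][i] * X_j."""
    point, vectors = frame
    n = len(vectors)
    dim = len(vectors[0])
    new_vectors = tuple(
        tuple(sum(a[j][i] * vectors[j][c] for j in range(n)) for c in range(dim))
        for i in range(n))
    return (point, new_vectors)


def matmul(a, b):
    """Ordinary matrix product AB of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


IDENTITY = [[1, 0], [0, 1]]
A = [[2, 1], [1, 1]]
B = [[0, 1], [-1, 3]]
frame = ((0, 0), ((1, 2), (3, 5)))  # a point of R^2 with a basis of its tangent space

print(act(act(frame, A), B) == act(frame, matmul(A, B)))  # True: (b^A)^B = b^(AB)
print(act(frame, IDENTITY) == frame)                      # True
```

With this convention the product appears in the order AB, so the action composes on the right; replacing `matmul(A, B)` by `matmul(B, A)` gives a different frame for these matrices.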

A curve $\tilde\alpha: I \to L(M)$ is called a lift of $\alpha$ ($\alpha: I \to M$) if $\pi \circ \tilde\alpha = \alpha$. For $b \in \pi^{-1}(\alpha(0))$,
$\tilde\alpha$ is called a lift at b if in addition $\tilde\alpha(0) = b$. The geometric importance of the frame
bundle lies in the following theorem about lifts.


THEOREM 4.1. Assigning a parallel translation $\tau_\alpha$ along each curve $\alpha$ in M
is equivalent to assigning to each curve $\alpha$ and each $b \in \pi^{-1}(\alpha(0))$, a unique lift
$\tilde\alpha_b$ of $\alpha$ at b such that

\[
\tilde\alpha_{b^A}(t) = [\tilde\alpha_b(t)]^A
\]

for all t in the domain of $\alpha$.


(The condition above means that the lift of $\alpha$ at $b^A$ is the lift of $\alpha$ at b acted on
by A at each point. We call this property equivariance. For reasons which will
appear later, the equivariant lift in the theorem is called the horizontal lift of $\alpha$
at b.)


Proof. Letting $b = (\alpha(0), X_1, \ldots, X_n) \in \pi^{-1}(\alpha(0))$, the correspondence is given
by

\[
\tau_{\alpha(t)}\Big(\sum_i c_i X_i\Big) = \sum_i c_i Y_i(\alpha(t)) \quad \longleftrightarrow \quad \tilde\alpha_b(t) = \big(\alpha(t), Y_1(\alpha(t)), \ldots, Y_n(\alpha(t))\big).
\]

Assuming that each curve $\alpha$ has a unique horizontal lift $\tilde\alpha_b$ at b, $\tilde\alpha_b(t)$ has the form
on the right, where $\{Y_i(\alpha(t))\}$ is a basis for $T_{\alpha(t)}M$. We then define $\tau_\alpha$ by the expression
on the left. It is easy to check that $\tau_\alpha$ is independent of the choice of b (precisely
because of equivariance) and that $\tau_\alpha$ is an isomorphism.

Conversely, given $\tau_\alpha$ for each curve $\alpha$, $\tilde\alpha_b$ is defined by the right-hand side. Certainly
$\tilde\alpha_b$ is a lift of $\alpha$ to b since $\tau_{\alpha(0)}$ is the identity on $T_{\alpha(0)}M$. The equivariance
follows from Theorem 2.4. ∎

In this proof, we obtained $\tilde\alpha_b$ by parallel translation along $\alpha$ of the basis represented
by b. One may picture this (for 2-manifolds) as in Fig. 4a. After superimposing
the picture of $\tilde\alpha_b$ on M (Fig. 4b), we see that $\tilde\alpha_b$ looks very much like a moving frame,
where the dotted vectors are obtained by parallel translation.
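A moving frame of this kind can be computed explicitly in the upper half-plane of Section 3. The Python sketch below is only an illustration under stated assumptions: it takes the parallel-translation equation in its usual coordinate form, $dV^k/dt + \sum_{i,j} \Gamma^k_{ij}\, \dot\alpha_i V^j = 0$, uses the Christoffel symbols found in Example 2, and integrates with a classical Runge-Kutta step. The sample curve, the matrix A, and all function names are ours.

```python
import math


def transport(curve, velocity, vector, t_end, steps=1000):
    """Parallel-translate `vector` along `curve` in the upper half-plane (RK4)."""

    def rhs(t, v):
        _, y = curve(t)
        dx, dy = velocity(t)
        # dV^k/dt = -Gamma^k_ij alpha'^i V^j with the half-plane Christoffel symbols.
        return ((dx * v[1] + dy * v[0]) / y, (-dx * v[0] + dy * v[1]) / y)

    h, t, v = t_end / steps, 0.0, tuple(vector)
    for _ in range(steps):
        k1 = rhs(t, v)
        k2 = rhs(t + h / 2, (v[0] + h / 2 * k1[0], v[1] + h / 2 * k1[1]))
        k3 = rhs(t + h / 2, (v[0] + h / 2 * k2[0], v[1] + h / 2 * k2[1]))
        k4 = rhs(t + h, (v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        t += h
    return v


def lift(curve, velocity, frame, t_end):
    """Value at t_end of the lift of `curve` that starts at the frame (curve(0), X_1, X_2)."""
    return (curve(t_end), tuple(transport(curve, velocity, x, t_end) for x in frame))


def act(vectors, a):
    """The vectors of the frame b^A: the i-th new vector is sum_j a[j][i] * X_j."""
    n = len(vectors)
    return tuple(tuple(sum(a[j][i] * vectors[j][c] for j in range(n)) for c in range(2))
                 for i in range(n))


def curve(t):
    return (t, 1.0 + t * t)


def velocity(t):
    return (1.0, 2.0 * t)


frame, a = ((1.0, 0.0), (0.0, 1.0)), [[2.0, 1.0], [1.0, 1.0]]
point, moved = lift(curve, velocity, frame, 1.0)
_, moved_after_acting = lift(curve, velocity, act(frame, a), 1.0)
print(moved_after_acting)
print(act(moved, a))  # agrees with the previous line up to rounding
```

The agreement of the last two lines is the equivariance condition of Theorem 4.1 at t = 1. Since the transport equation is linear in V, the agreement holds for the numerical scheme as well as for the exact solution.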


(a) (b) 
FIG. 4


Let us briefly recap the situation. We have two manifolds, M and L(M), a Lie
group G (Gl(n, R)) acting on L(M), and a map $\pi: L(M) \to M$. Our theorem says
that the geometric structure on M of parallel translation is precisely the same as
unique equivariant path-lifting of curves in M to curves in L(M) with specified
initial points. This last concept is analogous to that of unique path-lifting to covering
spaces (with a covering space of M replacing L(M) and the group of covering
transformations replacing G). Indeed, the problem of the existence of a lift $\tilde\alpha$ is really
a special case of the “lifting problem” in topology. The notion and the ability
to lift curves will be seen to be topological concepts, whereas the uniqueness of the
lift will be a geometric concept. Why can we lift curves in M to curves in L(M)?
It is because $\pi^{-1}(U_\gamma)$ is homeomorphic (via $\psi_\gamma$) to $U_\gamma \times G$. If $\alpha$ is any curve in $U_\gamma$
and h is any function from $U_\gamma$ into G, then $\tilde\alpha(t) = \psi_\gamma^{-1}(\alpha(t), h \circ \alpha(t))$ is a lift (not
necessarily the horizontal lift) of $\alpha$. It is this “local product” structure together
with the group action which plays a central role in our development of geometry.
We shall define a principal fiber bundle by imitating the essential features of the
frame bundle, and be able to put a “geometry” on it.

Let P be a manifold. We say that a Lie group G acts freely on P if there is a
map from $P \times G$ into P (we will write $(p, g) \mapsto p^g$) such that for all $p \in P$,
$(p^g)^h = p^{gh}$ for all g and h in G, and $p^g = p$ for any $p \in P$ if and only if g is the
identity in G. Points p and q in P are called equivalent under (the action of) G if
$q = p^g$ for some $g \in G$.

A (differentiable) principal fiber bundle $G \to P \to M$ consists of manifolds P
(the bundle space or total space) and M (the base space), a Lie group G (the structural
group), and a $C^\infty$ map $\pi: P \to M$ such that

(i) G acts freely on P,

(ii) M is the quotient space of P under the action of G, so that $\pi(p) = \pi(q)$
if and only if p and q are equivalent under G, and

(iii) for every $U_\gamma$ in some open covering of M, there is a diffeomorphism
$\psi_\gamma: \pi^{-1}(U_\gamma) \to U_\gamma \times G$ which commutes with the action of G, i.e., if $\psi_\gamma(p) = (\pi(p), h)$,
then $\psi_\gamma(p^g) = (\pi(p), hg)$.

For $m \in M$, the “fiber” $\pi^{-1}(m)$ is diffeomorphic to G, by condition (ii). Observe
that P is locally the product of M and G but that in general, P need not be
diffeomorphic to $M \times G$. (In particular, $L(M) \neq M \times Gl(n, R)$.) We shall mention
only three more examples of principal fiber bundles; others are presented, for example,
in [11].


Examples: 1. (The trivial bundle.) Let $P = M \times G$ and $\pi$ be the projection
onto the first factor. Then G acts on P by $((m, g), h) \mapsto (m, gh)$.
2. Let P be any covering space for M, $\pi: P \to M$ be the covering map, and G
be the group of deck transformations (with the discrete topology).
3. (The canonical $C^*$ bundle over $CP^n$.) Let $P = C^{n+1} - \{0\}$ (complex (n + 1)-space
minus the origin) and $G = C^*$ (the non-zero complex numbers). Complex
projective n-space $CP^n$ is defined as follows: we say that for $z_1$ and $z_2$ in P, $z_1 \sim z_2$
if there is $\lambda \in G$ such that $z_1 = \lambda z_2$. Then the set $CP^n$ of ~-equivalence classes of P
is a 2n-manifold and $G \to P \to CP^n$ is a principal fiber bundle.

If p is a point in a bundle space, the set $V_p = \{X \in T_p P \mid \pi_*(X) = 0\}$ is called
the vertical subspace at p. It is clear that a fiber $\pi^{-1}(m)$ is a manifold which sits
inside P and whose tangent space at each point p is the vertical subspace $V_p$ (Fig. 5),
for $\pi_*(X) = 0$ if and only if Xf depends only on the restriction of f to $\pi^{-1}(m)$.


FIG. 5


It is a well-known result in algebraic topology that any path in the base of a
fiber bundle may be lifted to the total space (because of the local product structure).
The lifting at a specified initial point is unique if and only if the fiber has no non-constant
paths (i.e., the fiber is discrete). That lifts are not in general unique is not
surprising from our viewpoint because, in Riemann’s language, the fiber bundle
is the “space” and something (a method of unique path-lifting) must be added to
provide the geometry. The ambiguity of the lift is that the lifted curve could try
to “travel up the fiber” as well as move “horizontally.” (Compare with covering
spaces, where the lift cannot go up.) We shall define a connection on a principal
fiber bundle, which is nothing more than a direction to go, or many directions not
to go.

Let $\xi$ denote a principal fiber bundle $G \to P \to M$ and n be the dimension of M.
A connection H on $\xi$ is an assignment for each $p \in P$ of an n-dimensional subspace
$H_p \subset T_p P$, called the horizontal subspace at p, such that for each p,

(i) $T_p P = V_p \oplus H_p$,

(ii) $(R_g)_*(H_p) = H_{p^g}$ for all $g \in G$, and

(iii) If $h: T_p P \to H_p$ is projection and X is a vector field on P, then hX is also
a vector field on P. (This is a differentiability condition on H.)

We have written $R_g$ for the diffeomorphism of P given by $p \mapsto p^g$.


THEOREM 4.2. Giving a connection on $\xi$ is equivalent to giving unique equivariant
path-lifting in $\xi$.


Proof (sketch). Assume that a connection H is given on $\xi$ and let $\alpha$ be a simple
curve in M. (For a proof without this assumption on $\alpha$, see [1], pp. 77-79.) Since
$\pi_* X = 0$ if and only if X is vertical, we see that the restriction of $\pi_*$ to any horizontal
subspace $H_p$ is one-to-one, hence an isomorphism (by dimensions) of $H_p$
onto $T_{\pi(p)}M$. For each $p \in \pi^{-1}(\alpha(t))$, let $X_p$ be the unique horizontal vector at p
which projects onto $\dot\alpha(t)$. These vectors may be extended to a vector field X on P
which at each point lies in the horizontal subspace. Now for $p \in \pi^{-1}(\alpha(0))$, let $\tilde\alpha_p$
be the (unique) integral curve of X such that $\tilde\alpha_p(0) = p$ (Fig. 6). The compactness
of I insures that $\tilde\alpha_p$ may be defined on all of I, if necessary by piecing together curves
defined on small intervals. The tangent vectors to $\tilde\alpha_p$ are clearly horizontal and
$\tilde\alpha_p$ is a $C^\infty$ curve which projects onto $\alpha$ (this is not clear!). The equivariance follows
from our construction and the second condition for H.


FIG. 6


Now suppose we have unique equivariant path-lifting on $\xi$. To define $H_p$ for
$p \in P$, let $\alpha_1, \ldots, \alpha_n$ be curves in M such that $\alpha_i(0) = m = \pi(p)$ and $\{\dot\alpha_i(0)\}$ forms a
basis for $T_m M$. (The $\alpha_i$’s may be taken to be integral curves of the vectors in any
basis for $T_m M$.) Let $\tilde\alpha_i$ be the lift of $\alpha_i$ at p. Define $H_p$ to be the span of $\{\dot{\tilde\alpha}_i(0)\}$ (see
Fig. 7). A little work shows that $H_p$ is independent of the choice of the $\alpha_i$’s and that
$p \mapsto H_p$ is a connection. ∎


FIG. 7


The definition of connection in this setting is due to Ehresmann [3] although 


500 R. J. WILSON [May 


it is implicit in some work of Cartan. The authors feel that even in this setting there 
is the dichotomy of Riemann—the concept of space (the principal fiber bundle) 
and the additional structure which yields the geometry (the connection), and that 
the geometric content of the massive structure of modern differential geometry 
is still apparent. 


References 


1. R. L. Bishop and R. J. Crittenden, Geometry of Manifolds, Academic Press, New York, 
1964. 

2. S. S. Chern, Some new viewpoints in differential geometry in the large, Bull. Amer. Math. 
Soc., 52 (1946) 1-30. 

3. C. Ehresmann, Les connections infinitésimales dans un espace fibré différentiable, Colloque 
de Topologie, Bruxelles (1950) 29-55. 

4. H. Flanders, Differential forms, MAA Studies No. 4 in Global Geometry and Analysis, 
edited by S. S. Chern, 1967. 

5. N. J. Hicks, Notes on Differential Geometry, Van Nostrand, Princeton, N. J., 1965. 

6. S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vol. I and II. Interscience,
New York, 1963 and 1969.

7. R.S. Millman, Geodesics in metrical connections, Proc. Amer. Math. Soc., 30 (1971) 551-555. 

8. K. Nomizu, Recent developments in the theory of connections and holonomy groups,
Advances in Math., 1 (1961) 1-49.

9. D. E. Smith, A Source Book in Mathematics, Vol. II. Dover, New York, 1959. 

10. M. Spivak, Differential Geometry, Vol. I and II. Published by Michael Spivak, 1970. 

11. N. Steenrod, The Topology of Fiber Bundles, Princeton Univ. Press, 1951. 


AN INTRODUCTION TO MATROID THEORY 
R. J. WILSON, The Open University, England 


1. Introduction. In this expository article, we shall be presenting a survey of 
some of the most important aspects of matroid theory, a branch of combinatorial 
mathematics which has come very much to the fore in the last few years. 

The subject originates from a fundamental paper of Hassler Whitney [35], which 
appeared in 1935. He had just spent several years working in the field of graph 
theory, and had noticed several similarities between the ideas of independence and 
rank in graph theory and those of linear independence and dimension in the study 
of vector spaces. In his paper Whitney used the concept of a matroid to formalize 
these similarities. A matroid is essentially a set with some kind of ‘independence 


Robin Wilson did his undergraduate work at Balliol College, Oxford, and his graduate work at
the University of Pennsylvania and M.I.T. His Penn. Ph.D. on sieve methods was written under
N. C. Ankeny. He was a Lecturer at Jesus College, Oxford, and now is Lecturer at the Open University.
The present article is derived from his lectures at the Combinatorial Analysis Institute, Bowdoin College.
He is the author of Introduction to Graph Theory (Oliver & Boyd, Edinburgh, 1972). Editor.


1973] AN INTRODUCTION TO MATROID THEORY 501 


structure’ defined on it; the name ‘matroid’ arose from his consideration of the 
independence of the columns of a matrix. At about the same time, B. L. van der 
Waerden [32] rediscovered the idea of a matroid while trying to formalize the 
definitions of linear and algebraic independence. 

The work of Whitney and van der Waerden was largely ignored for over twenty 
years (with the important exceptions of a paper of S. Maclane [15] in 1936, and one 
of R. Rado [27] in 1942) until a breakthrough occurred in 1958 when W. T. Tutte 
characterized those matroids which arise from graphs (see [31]). Later, in 1965, 
J. Edmonds and D. R. Fulkerson [8] (and independently, L. Mirsky and H. Perfect 
[20] and Brualdi and Scrimger [4]) recognized the importance of matroids in 
transversal theory. Since then, a large number of combinatorialists have contributed 
to the subject, and there is already an impressive literature in the field. 

In this article, no previous knowledge of graph theory or transversal theory is 
assumed. We shall develop the necessary background material for these subjects in 
Sections 2-4, and these sections are written in such a way as to motivate the material 
which follows. In Sections 5-7, several equivalent definitions of a matroid are 
presented, together with a wide variety of examples. After discussing matroid duality 
in some detail, we then show (Sections 10-12) how matroid theory can be used to 
simplify various ideas in graph theory and transversal theory. The article concludes 
with a brief discussion of some recent work in the subject. Our aim throughout is to 
show that matroid theory is far from being ‘generalization for generalization’s sake’; 
on the contrary, it gives us a deeper insight into several problems in transversal 
theory, as well as including among its applications simple proofs of results in graph 
theory which are awkward to prove by more traditional methods. 

The treatment of the subject given here is somewhat similar to that of the last 
chapter of the author’s recent introductory text on graph theory [36]; several proofs 
which have been omitted from this article may be found in this book. We shall be 
interested only in finite matroids (i.e., matroids defined on finite sets), and shall
always use |E| to denote the number of elements in a set E. The reader who is in- 
terested in matroids defined on infinite sets should refer to Rado [28] or Brualdi 
and Scrimger [4] for further details. Finally, we should like to point out that although 
due credit has been given where possible to those responsible for results cited in 
this paper, several of these results are so firmly embedded in the folk-lore of the 
subject, that to give such due credit has been impossible. We should consequently 
like to apologize in advance to anyone who feels that he (or she) has been overlooked. 


2. Some results in Graph Theory. In this section, we present a brief survey of 
those results of graph theory which we shall need later. The reader is referred to [36] 
for a fuller treatment of the subject. 

A graph G is defined to be a pair (V(G), E(G)), where V(G) is a finite non-empty
set of elements (called vertices), and E(G) is a finite family of unordered pairs of
elements of V(G) (called edges). An example of a graph is given in figure 1; in this
example, V(G) is the set {u, v, w, z}, and E(G) consists of the edges {u, v}, {v, v},
{v, w}, {v, w}, {u, w} and {w, z}. The edge {v, v} is called a loop, and the two edges of
the form {v, w} are called multiple edges; any graph containing no loops or multiple
edges is called a simple graph. The edge {u, v} is said to join the vertices u and v, and
u and v are then said to be incident to this edge. A graph, each of whose vertices
belongs to V(G) and each of whose edges belongs to E(G), is called a subgraph of G.
We shall call two graphs G and G’ isomorphic if there is a one-one correspondence
between their sets of vertices with the property that the number of edges joining any
two vertices of G is equal to the number of edges joining the corresponding vertices
of G’.


FIG. 1

If G is a graph, a path in G is a finite sequence of distinct edges of the form


$\{v_0, v_1\}, \{v_1, v_2\}, \ldots, \{v_{m-1}, v_m\}.$


A graph G is called connected if, given any two vertices v and w, there is a path 
connecting v and w. Any graph which is not connected may be split up into a finite 
number of connected subgraphs, called components; for example, the graph in 
figure 2 has three components. 


FIG. 2


A non-empty path in which all the vertices are distinct (except for $v_0$ and $v_m$,
which are the same) is called a circuit; for example, any loop or any pair of multiple
edges forms a circuit. A graph in which every circuit contains an even number of
edges is called a bipartite graph (see figure 3); note that in a bipartite graph, the
set of vertices can be partitioned into two sets in such a way that each edge joins two
vertices, one from each set. A cutset of a graph G is a set of edges whose removal
increases the number of components of G, and which is minimal with respect to
this property; for example, in figure 1, the edges {u, v} and {u, w} form a cutset.
The following results can now be easily proved:


FIG. 3


(G1) (i) If $C_1$ and $C_2$ are distinct circuits of a graph G, each containing an
edge e, then there exists a circuit in $C_1 \cup C_2$ which does not contain e; similarly,
(ii) if $C_1^*$ and $C_2^*$ are distinct cutsets of G, each containing an edge e, then there
exists a cutset in $C_1^* \cup C_2^*$ which does not contain e.


(G2) If C is a circuit of a graph G and C* is a cutset, then the number of edges 
of G common to C and C* is even. 


A graph which contains no circuits is called a forest, and a connected forest is a 
tree. If G is a connected graph, then a spanning tree T of G is a tree which contains 
every vertex of G and all of whose edges are edges of G (see figure 4). Similarly, if G 
is any graph, we define a spanning forest of G to be a forest obtained by taking a 
spanning tree of each component of G. The following results are now easily proved: 


FIG. 4


(G3) No spanning forest of G contains another spanning forest as a proper
subgraph.


(G4) If $T_1$ and $T_2$ are two spanning forests of G, and e is an edge of $T_1$, then
there exists an edge f of $T_2$ with the property that $(T_1 - \{e\}) \cup \{f\}$ (the graph
obtained from $T_1$ on replacing e by f) is also a spanning forest of G.


By repeatedly using these two results, (G3) and (G4), one can easily deduce the 
following: 


(G5) Any two spanning forests of G contain the same number of edges. 


If G contains n vertices and k components, then the number of edges in any 
spanning forest is n — k; this number is called the rank of G and is denoted by 
$\kappa(G)$. The number of edges which must be removed from G to produce a spanning
forest will be denoted by (G), and is equal to m —n+k, where m denotes the 
number of edges of G. Some of the most important properties of the rank function 
are described in the following proposition: 


(G6) The rank function $\kappa$ satisfies the following properties: (i) for each subgraph
H of G, $0 \le \kappa(H) \le |E(H)|$; (ii) if H is a subgraph of K, then $\kappa(H) \le \kappa(K)$; (iii) for
any subgraphs H and K of G, $\kappa(H \cup K) + \kappa(H \cap K) \le \kappa(H) + \kappa(K)$.
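For the graph of figure 1 the rank and the three properties of (G6) can be verified by exhaustive search. The Python sketch below is a hedged illustration: it identifies a subgraph with its set of edges (isolated vertices do not change n - k), counts components with a simple union-find, and tests every pair of edge sets. The names `components`, `rank` and `g6` are our own.

```python
from itertools import combinations

VERTICES = ("u", "v", "w", "z")
# The edges of the graph of figure 1, including the loop {v, v} and the double edge {v, w}.
EDGES = (("u", "v"), ("v", "v"), ("v", "w"), ("v", "w"), ("u", "w"), ("w", "z"))


def components(edge_ids):
    """Number of components of the subgraph with all vertices and the chosen edges."""
    parent = {x: x for x in VERTICES}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for e in edge_ids:
        a, b = EDGES[e]
        parent[find(a)] = find(b)
    return len({find(x) for x in VERTICES})


def rank(edge_ids):
    """n - k; an isolated vertex adds 1 to both n and k, so only the edges matter."""
    return len(VERTICES) - components(edge_ids)


ALL = frozenset(range(len(EDGES)))
SUBSETS = [frozenset(c) for size in range(len(EDGES) + 1)
           for c in combinations(range(len(EDGES)), size)]

g6 = (all(0 <= rank(h) <= len(h) for h in SUBSETS)
      and all(rank(h) <= rank(k) for h in SUBSETS for k in SUBSETS if h <= k)
      and all(rank(h | k) + rank(h & k) <= rank(h) + rank(k)
              for h in SUBSETS for k in SUBSETS))

print(rank(ALL), len(EDGES) - rank(ALL), g6)  # 3 3 True
```

Here n = 4 and k = 1, so the rank is 3, and with m = 6 edges exactly m - n + k = 3 edges must be removed to leave a spanning tree.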


We conclude this section with two simple results which combine the concepts of 
circuit and cutset with that of a spanning forest: 


(G7) (i) Every cutset of a graph G has an edge in common with every spanning
forest; (ii) every circuit of G has an edge in common with the complement of any 
spanning forest (i.e., the graph obtained by removing from G the edges of the 
spanning forest). 


3. Some results on vector spaces and projective spaces. If V is a finite- 
dimensional vector space over a field F, a basis of V is a linearly independent set 
of elements which span V. It is well known that the bases of V satisfy the following 
properties: 


(V1) No basis of V properly contains another basis of V. 


(V2) If $B_1$ and $B_2$ are bases of V, and v is an element of $B_1$, then there exists an
element w of $B_2$ with the property that $(B_1 - \{v\}) \cup \{w\}$ is also a basis of V.


It is now easy to prove the following: 
(V3) Any two bases of V contain the same number of elements. 


As in the previous section, we can obtain a rank function by defining the rank 
r(A) of each subset A of V to be the dimension of the subspace of V spanned by the 
vectors in A. The following proposition is then easily proved: 


(V4) The rank function r satisfies the following properties: (i) for each subset
A of V, $0 \le r(A) \le |A|$; (ii) if $A \subseteq B$, then $r(A) \le r(B)$; (iii) for any A and B,
$r(A \cup B) + r(A \cap B) \le r(A) + r(B)$.


A result analogous to (V4) holds also for projective spaces. If P is a projective 
space of dimension n, then we define the rank r(A) of a finite set A of elements of P 
to be one more than the dimension of the projective subspace spanned by A. Clearly, 
this rank function also satisfies the properties (i), (ii) and (iii) of (V4). 


4. Some results in transversal theory. In this section, we present a brief in- 
troduction to transversal theory. A fuller treatment of this subject, including proofs 
of the results stated here, may be found in Mirsky’s survey article [19] or in his 
book [18]. 

Let E be a finite set, and $\mathcal{S} = (S_1, \ldots, S_m)$ be a family of non-empty subsets of E.
A transversal (or system of distinct representatives) of $\mathcal{S}$ is a set of m distinct
elements of E, one chosen from each of the subsets $S_i$; a partial transversal of $\mathcal{S}$ is a
transversal of some subfamily of $\mathcal{S}$. For example, the family $\mathcal{S} = (S_1, S_2, S_3)$ of
subsets of E = {a, b, c, d}, where $S_1 = \{b, c, d\}$, $S_2 = S_3 = \{a\}$, has no transversal,
but its partial transversals are easily seen to be ∅, {a}, {b}, {c}, {d}, {a, b}, {a, c}
and {a, d}. Note that the situation can be represented by a bipartite graph in which
each edge joins one of the $S_i$ to one of its elements (see figure 5); a partial transversal
then corresponds to a set of edges, no two of which have a vertex in common.
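The list of partial transversals in this example can be reproduced by brute force. The Python sketch below is an illustration only, and the function name `is_partial_transversal` is an assumption of ours. A subset counts as a partial transversal when its elements can be assigned to distinct members of the family that contain them.

```python
from itertools import combinations, permutations

E = ("a", "b", "c", "d")
FAMILY = ({"b", "c", "d"}, {"a"}, {"a"})  # S_1, S_2, S_3


def is_partial_transversal(subset, family):
    """True if the elements of `subset` can be chosen from distinct members of `family`."""
    elements = sorted(subset)
    return any(all(x in family[i] for x, i in zip(elements, choice))
               for choice in permutations(range(len(family)), len(elements)))


partial_transversals = [set(c) for size in range(len(E) + 1)
                        for c in combinations(E, size)
                        if is_partial_transversal(c, FAMILY)]
print(partial_transversals)
```

The search finds the empty set, the four singletons, and the three pairs containing a, and nothing with three elements, so the family has no transversal. Trying every assignment is exponential in the size of the family and is meant only for examples of this size.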


FIG. 5


In 1935, Philip Hall obtained necessary and sufficient conditions for the family $\mathcal{S}$
to have a transversal:


(T1) (Hall’s ‘marriage’ theorem): $\mathcal{S}$ has a transversal if and only if, for each
k satisfying $1 \le k \le |E|$, the union of any k of the subsets $S_i$ contains at least k
elements.
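Hall's condition is easy to test mechanically on small families. The following Python sketch compares it with a brute-force search for a transversal; the function names and the second family `other` are hypothetical choices of ours. Only values of k up to the number of sets are examined, since one cannot choose more than m of the subsets.

```python
from itertools import combinations, permutations


def satisfies_hall(family):
    """Hall's condition: any k of the sets have at least k elements in their union."""
    return all(len(set().union(*group)) >= k
               for k in range(1, len(family) + 1)
               for group in combinations(family, k))


def has_transversal(family):
    """Brute-force search for distinct representatives, one from each set."""
    elements = sorted(set().union(*family))
    return any(all(x in s for x, s in zip(choice, family))
               for choice in permutations(elements, len(family)))


example = [{"b", "c", "d"}, {"a"}, {"a"}]     # the family of the example above
other = [{"a", "b"}, {"b", "c"}, {"a", "c"}]  # a hypothetical second family
print(satisfies_hall(example), has_transversal(example))  # False False
print(satisfies_hall(other), has_transversal(other))      # True True
```

In the first family the two sets equal to {a} have a union with only one element, which is exactly the obstruction that Hall's condition detects.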


By a straightforward (but somewhat technical) argument, one can also prove the 
following result: 


(T2) $\mathcal{S}$ has a transversal containing a given subset A if and only if (i) $\mathcal{S}$ has a
transversal, and (ii) A is a partial transversal of $\mathcal{S}$.


The reason for introducing transversal theory should become clearer after we
have stated the next two propositions. In the following, a maximal partial transversal
of $\mathcal{S}$ is a partial transversal of $\mathcal{S}$ which is not properly contained in any other
partial transversal; it follows that if $\mathcal{S}$ actually has a transversal, then every maximal
partial transversal of $\mathcal{S}$ is a transversal.


(T3) No maximal partial transversal of $\mathcal{S}$ properly contains another maximal
partial transversal.


(T4) If $T_1$ and $T_2$ are maximal partial transversals of $\mathcal{S}$, and x is an element of
$T_1$, then there exists an element y of $T_2$ with the property that $(T_1 - \{x\}) \cup \{y\}$ is
a maximal partial transversal of $\mathcal{S}$.


By repeatedly using these two results, one can easily deduce the following: 


(T5) Any two maximal partial transversals of $\mathcal{S}$ contain the same number of
elements.


As before, we can define a rank function $\rho$ on the set of subsets of E, by defining
$\rho(A)$ to be the number of elements in the largest partial transversal of $\mathcal{S}$ contained
in A; we can now prove the following:


(T6) The rank function $\rho$ satisfies the following properties: (i) for each subset
A of E, $0 \le \rho(A) \le |A|$; (ii) if $A \subseteq B$, then $\rho(A) \le \rho(B)$; (iii) for any $A, B \subseteq E$,
$\rho(A \cup B) + \rho(A \cap B) \le \rho(A) + \rho(B)$.

The reader should note the similarities between (G3), (V1) and (T3), between 
(G4), (V2) and (T4), between (G5), (V3) and (T5), and between (G6), (V4) and (T6). 


5. The definition of a matroid. Motivated by the results of the three previous 
sections, we now give several different definitions of a matroid; proofs of their 
equivalence may be found in Whitney’s original paper [35]. The reader who finds 
this section heavy-going may wish to refer forward to Section 6, where several 
examples are given. 

A matroid M = (E, &) consists of a non-empty finite set E, together with a non- 
empty collection # of subsets of E (called bases) satisfying the following properties: 

(Bi) No base property contains another base. 

(Ai) If B, and B, are bases, and x is an element of B,, then there exists an 
element y of B, with the property that (B, — {x}) U {y} is also a base. 

As in the previous three sections, one can easily deduce from these two properties 
that any two bases of a matroid contain the same number of elements (sometimes 
called the rank of the matroid). Note that this result generalizes (G5), (V3) and (T5). 
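
The two base properties can be checked mechanically on a small example. The following Python sketch is illustrative only: the function name `is_base_family` is our own, and the example uses the bases {a,b} and {a,c} of the three-element matroid discussed in Section 6(2).

```python
def is_base_family(bases):
    """Check the two base properties for a collection of frozensets."""
    bases = set(bases)
    if not bases:
        return False
    # First property: no base properly contains another base.
    if any(b1 < b2 for b1 in bases for b2 in bases):
        return False
    # Second property: for x in B1 there is y in B2 with (B1 - {x}) | {y} a base.
    for b1 in bases:
        for b2 in bases:
            for x in b1:
                if not any((b1 - {x}) | {y} in bases for y in b2):
                    return False
    return True


# Bases of the example matroid on {a, b, c}.
bases = {frozenset("ab"), frozenset("ac")}
print(is_base_family(bases))
# All bases of a matroid have the same size (the rank of the matroid).
print({len(b) for b in bases})
```

A family such as {a,b}, {c,d} fails the second property, since removing a from {a,b} leaves no element of {c,d} that restores a base.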

If M is a matroid on a set E, we say that a subset A of E is an independent set 
if A is contained in some base of M. It follows that the bases of M are precisely the 
maximal independent sets, and hence that the matroid is completely determined by 
listing its independent sets. It seems reasonable to expect, therefore, that there may 
be a simple definition of a matroid in terms of its independent sets. One such de- 
finition is as follows: 

A matroid $M = (E, \mathcal{I})$ consists of a non-empty finite set E, together with a non-empty 
collection $\mathcal{I}$ of subsets of E (called independent sets) satisfying the following 
properties: 

($\mathcal{I}$i) any subset of an independent set is an independent set; 

($\mathcal{I}$ii) if $I$ and $J$ are independent sets, and $|J| > |I|$, then there exists an element 
$x$ belonging to $J$ but not to $I$ with the property that $I \cup \{x\}$ is an independent set. 

Using this definition, it is an easy exercise to show that any independent set can 
be extended to a base, and that if A is a subset of E, then any two maximal in- 
dependent subsets of A contain the same number of elements. (The reader should 
interpret these results in terms of graphs, vector spaces, and transversals.) We can 
also use independent sets to formulate the definition of the isomorphism of matroids: 
two matroids $M_1$ and $M_2$ are isomorphic if there is a one-one correspondence 
between their underlying sets $E_1$ and $E_2$, with the property that a set of elements of 
$E_1$ is independent in $M_1$ if and only if the corresponding set of elements of $E_2$ is 
independent in $M_2$. 
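
As an illustration of the definition in terms of independent sets, the following Python sketch checks the two properties by brute force and extends an independent set to a base greedily. The names `is_independence_family` and `extend_to_base` are our own; the example is the matroid on {a, b, c} of Section 6(2).

```python
from itertools import combinations


def is_independence_family(ind):
    """Check the hereditary and augmentation properties by brute force."""
    ind = set(ind)
    if not ind:
        return False
    # Any subset of an independent set is independent.
    for i in ind:
        for r in range(len(i)):
            for sub in combinations(i, r):
                if frozenset(sub) not in ind:
                    return False
    # If |J| > |I|, some x in J but not in I can be added to I.
    for i in ind:
        for j in ind:
            if len(j) > len(i) and not any(i | {x} in ind for x in j - i):
                return False
    return True


def extend_to_base(start, ground, ind):
    """Greedily add elements while the set stays independent."""
    current = frozenset(start)
    for x in sorted(ground):
        if current | {x} in ind:
            current = current | {x}
    return current


ground = frozenset("abc")
ind = {frozenset(s) for s in ["", "a", "b", "c", "ab", "ac"]}
print(is_independence_family(ind))
print(sorted(extend_to_base("b", ground, ind)))
```

The greedy extension ends at a maximal independent set because, by the hereditary property, an element rejected at one step can never be added later.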

Our third definition of a matroid is very similar to Whitney’s original definition; 
this is given in terms of a rank function and generalizes the results (G6), (V4) and 
(T6) above. 

If $M = (E, \mathcal{I})$ is a matroid defined in terms of its independent sets, then the 
rank $\rho(A)$ of a subset A of E is defined to be the number of elements in the largest 
independent set contained in A. It follows that a subset A of E is independent if and 
only if $\rho(A) = |A|$, and that if B is a base, then $|B| = \rho(B) = \rho(E)$. Since a matroid 
is completely determined by its rank function $\rho$, we can redefine a matroid in terms 
of it, as follows: 

A matroid $M = (E, \rho)$ consists of a non-empty finite set E, together with an 
integer-valued function $\rho$ (called its rank function) which is defined on the set of 
subsets of E and which satisfies the following properties: 

($\rho$i) for each subset A of E, $0 \le \rho(A) \le |A|$; 

($\rho$ii) if $A \subseteq B \subseteq E$, then $\rho(A) \le \rho(B)$; 

($\rho$iii) for any $A, B \subseteq E$, $\rho(A \cup B) + \rho(A \cap B) \le \rho(A) + \rho(B)$. 
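
The three rank properties can be verified by exhaustive search on a small matroid. In the sketch that follows (the function names are our own, and the example is again the matroid on {a, b, c} of Section 6(2)), the rank of a set is computed directly from the list of independent sets.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def rank(a, ind):
    """Size of a largest independent set contained in a."""
    return max(len(i) for i in ind if i <= a)


def satisfies_rank_properties(ground, ind):
    all_sets = subsets(ground)
    for a in all_sets:
        # Bounds: 0 <= rank(A) <= |A|.
        if not 0 <= rank(a, ind) <= len(a):
            return False
        for b in all_sets:
            # Monotonicity.
            if a <= b and rank(a, ind) > rank(b, ind):
                return False
            # Submodular inequality.
            if rank(a | b, ind) + rank(a & b, ind) > rank(a, ind) + rank(b, ind):
                return False
    return True


ground = frozenset("abc")
ind = {frozenset(s) for s in ["", "a", "b", "c", "ab", "ac"]}
print(satisfies_rank_properties(ground, ind))
print(rank(frozenset("bc"), ind), rank(ground, ind))
```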

For future convenience, we define a loop of M to be an element x of E such that 
$\rho(\{x\}) = 0$, and a pair of parallel elements of M to be a pair x, y of elements of E 
which are not loops, and for which $\rho(\{x, y\}) = 1$. A matroid which contains no loops 
or pairs of parallel elements is called a simple matroid. 

The connection between matroid theory and graph theory may be seen by defining 
a matroid in terms of its circuits. We shall call a subset of E dependent if it is not 
independent, and a minimal dependent set will then be called a circuit. Since the 
circuits of a matroid determine the independent sets, we can redefine a matroid 
in terms of its circuits as follows: 


A matroid $M = (E, \mathcal{C})$ consists of a non-empty finite set E, together with a 
collection $\mathcal{C}$ of non-empty subsets of E (called circuits) satisfying the following 
properties: 

($\mathcal{C}$i) no circuit properly contains another circuit; 

($\mathcal{C}$ii) if $C_1$ and $C_2$ are distinct circuits each containing an element $x$, then there 
exists a circuit in $C_1 \cup C_2$ which does not contain $x$. 

(Note that this definition gives the extension to matroids of property (G1) (i).) 
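
To see the circuit properties at work, one may compute the circuits of a small matroid as its minimal dependent sets. The Python sketch below does this for the 2-uniform matroid on four elements (every set of at most two elements is independent); the function names are our own and the check is by brute force.

```python
from itertools import combinations


def subsets(ground):
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def circuits(ground, ind):
    """Minimal dependent sets of the matroid with independent sets `ind`."""
    dependent = [s for s in subsets(ground) if s not in ind]
    return {d for d in dependent if not any(e < d for e in dependent)}


def satisfies_circuit_properties(circs):
    # No circuit properly contains another circuit.
    if any(c1 < c2 for c1 in circs for c2 in circs):
        return False
    # Two distinct circuits sharing x leave a circuit in their union avoiding x.
    for c1 in circs:
        for c2 in circs:
            if c1 != c2:
                for x in c1 & c2:
                    if not any(c <= (c1 | c2) - {x} for c in circs):
                        return False
    return True


ground = frozenset("abcd")
ind = {s for s in subsets(ground) if len(s) <= 2}
circs = circuits(ground, ind)
print(sorted("".join(sorted(c)) for c in circs))
print(satisfies_circuit_properties(circs))
```

Here the circuits are the four three-element subsets, and any two of them span the whole ground set, so the third and fourth triples supply the required circuit.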

We conclude this section by defining a matroid on a set E in terms of a closure 
operation on the set $\mathcal{P}(E)$ of subsets of E. If $M = (E, \rho)$ is a matroid on E defined in 
terms of its rank function $\rho$, then the closure (or span) $c(A)$ of a subset A of E is 
defined to be the set of all those elements x of E which depend on A; i.e., 
$c(A) = \{x \in E : \rho(A \cup \{x\}) = \rho(A)\}$. We can now redefine M in terms of c as follows: 

A matroid $M = (E, c)$ consists of a non-empty finite set E, together with a function 
$c : \mathcal{P}(E) \to \mathcal{P}(E)$, satisfying the following properties: 

(ci) for each subset A of E, $A \subseteq c(A) = c(c(A))$; 

(cii) if $A \subseteq B \subseteq E$, then $c(A) \subseteq c(B)$; 

(ciii) if $x \in c(A \cup \{y\})$ and $x \notin c(A)$, then $y \in c(A \cup \{x\})$. 

The first two of these conditions express the fact that c is a closure operation on E, 
and the third says that c satisfies what is usually known as the exchange condition, 
which may be expressed informally by saying that if x depends on A and y, but not 
on A alone, then y depends on A and x. We leave it to the reader to verify that a 
subset A of E is independent in M if and only if no element of A depends on the others, 
i.e., if and only if $x \notin c(A - \{x\})$ for each $x \in A$. 
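
The closure operation, and the characterization of independence just stated, can be tried out on the matroid on {a, b, c} of Section 6(2). The following Python sketch uses our own function names and computes closures from the rank function exactly as in the definition.

```python
from itertools import combinations


def subsets(ground):
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def rank(a, ind):
    return max(len(i) for i in ind if i <= a)


def closure(a, ground, ind):
    """All x whose addition to a does not increase the rank."""
    a = frozenset(a)
    return frozenset(x for x in ground if rank(a | {x}, ind) == rank(a, ind))


def independent_by_closure(a, ground, ind):
    """True if no element of a depends on the others."""
    a = frozenset(a)
    return all(x not in closure(a - {x}, ground, ind) for x in a)


ground = frozenset("abc")
ind = {frozenset(s) for s in ["", "a", "b", "c", "ab", "ac"]}
print(sorted(closure("b", ground, ind)))
print(all(independent_by_closure(s, ground, ind) == (s in ind)
          for s in subsets(ground)))
```

In this example b and c are parallel elements, so each lies in the closure of the other.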

A matroid M defined in terms of its closure operation is sometimes called a 
pregeometry; if in addition M is simple, then M is said to be a geometry. The reader 
who wishes to pursue matroid theory from this point of view should refer to Crapo 
and Rota’s book [5] for a fuller treatment. 


6. Some examples of matroids. To help the reader assimilate the various 
definitions given in the previous section, we now discuss several important types of 
matroid. 


(1) UNIFORM MATROIDS. If E is a non-empty finite set, the k-uniform matroid 
on E is obtained by taking as bases all those subsets of E which contain exactly k 
elements. It follows immediately from this that the independent sets are precisely 
those subsets of E which contain not more than k elements, and that the rank of any 
subset A of E is either $|A|$ or k, whichever is the smaller. Of particular importance 
are the 0-uniform matroid on E (the trivial matroid) whose only independent set is 
the empty set and whose rank function is identically zero, and the $|E|$-uniform 
matroid (the discrete, or free, matroid) in which every subset of E is independent and 
in which the rank of any subset of E is its cardinality. Note that the discrete matroid 
on E has only one base (namely E itself) and no circuits, and that the closure of any 
set A is A itself. 


(2) GRAPHIC MATROIDS. As we indicated in Section 2, we can define a matroid 
on the set of edges of a graph G by taking as bases of the matroid the edges of the 
various spanning forests of G. We shall call this matroid the circuit matroid of G, 
and denote it by M(G). It follows that a set of edges of G is independent if and only 
if it contains no circuit of G, and that the circuits of the matroid M(G) are precisely 
the circuits of G. Note also that (by (G6)) the rank function of M(G) is simply $\kappa$, 
and that if G is a simple graph, then M(G) is a simple matroid. 

We shall call a matroid M graphic if M is isomorphic to the circuit matroid of 
some graph G. An example of a graphic matroid is obtained by letting E = {a, b, c}, 
and taking as independent sets $\emptyset$, {a}, {b}, {c}, {a,b} and {a,c} (see figure 6); an 
example of a nongraphic matroid is the 2-uniform matroid on a set of four elements, 
as the reader will discover if he tries to draw the graph. Graphic matroids have been 
characterized in terms of ‘excluded minors’ in an important series of papers by 
Tutte (see [31]); we shall present his characterization in Section 10. 
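
The independent sets of a circuit matroid can be listed by testing each set of edges for circuits. The Python sketch below uses a small union-find routine for this purpose. The vertex labels 1, 2, 3 and the choice of end-points are our own assumption: they describe one graph (an edge a followed by two parallel edges b and c) whose circuit matroid has the independent sets listed above.

```python
from itertools import combinations


def is_forest(edge_names, edges):
    """True if the named edges contain no circuit (union-find test)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for name in edge_names:
        u, w = edges[name]
        root_u, root_w = find(u), find(w)
        if root_u == root_w:
            return False  # this edge would close a circuit
        parent[root_u] = root_w
    return True


# Assumed drawing: a joins 1 and 2; b and c both join 2 and 3.
edges = {"a": (1, 2), "b": (2, 3), "c": (2, 3)}
independent = {frozenset(c)
               for r in range(len(edges) + 1)
               for c in combinations(sorted(edges), r)
               if is_forest(c, edges)}
print(sorted("".join(sorted(s)) for s in independent))
```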


FIG. 6 


(3) COGRAPHIC MATROIDS. The circuit matroid M(G) is not the only interesting 
matroid which can be defined on the set E of edges of a graph G. In view of (G1) (ii), 
we can define a matroid on E by taking as its circuits the cutsets of G. This matroid 
is called the cutset matroid of G and is denoted by M*(G). Note that a set of edges of 
G is independent in M*(G) if and only if it contains no cutset of G. 

We shall call a matroid M cographic if M is isomorphic to the cutset matroid of 
some graph G. The circuit matroids of the graphs $K_5$ and $K_{3,3}$ (depicted in figure 7) 
are examples of matroids which are not cographic; another example is given by the 
non-graphic matroid described above. We shall see later (in Sections 9 and 10) that the 
matroids M(G) and M*(G) are dual matroids (in a sense to be made precise), and 
that the concept of the planarity of a graph can be extended in a natural way to 
matroids. 


FIG. 7 


(4) REPRESENTABLE MATROIDS. Let E be a finite set of vectors in some vector 
space V over a field F. We can define a matroid M on E by taking as independent 
sets of the matroid those subsets of E which are linearly independent in V. The bases 
of M are then precisely those subsets of E which span the same subspace as E (so 
that if the elements of E span V, then every base of M is a basis of V); note that the 
rank of any subset A of E is simply the dimension of the subspace of V spanned by A, 
and that the closure of A consists of all those elements of this subspace which lie in E. 

We can similarly obtain a matroid from any finite set E of elements of a projective 
space over F; in this case, the rank of any subset A of E is simply its rank r(A), as 
defined in Section 3. The maximal subsets of rank one, two, three and $r(E) - 1$ are 
then called points, lines, planes, and hyperplanes respectively. (Note that if A is 
such a maximal set, then c(A) = A.) 

We shall say that a matroid M on a set E is linear over F if M is isomorphic to 
a matroid obtained in the above way from a finite set of vectors in some vector space V 
over F. This definition, however, automatically excludes any matroid which contains 
too many loops or parallel elements, since each loop must correspond to the zero 
element of V, and parallel elements correspond to dependent vectors (leading to 
possible trouble if F is finite). 

It is therefore convenient to define a matroid M to be representable over a field F 
(or, simply, representable) if there exists a (not necessarily one-one) rank-preserving 
mapping from E to the underlying set of some matroid which is linear over F. Note 
that this is equivalent to saying that the simple matroid M’ obtained from M by 
removing all loops and identifying parallel elements is linear over F, or, in fact, 
that M’ is isomorphic to a matroid defined in some projective space over F. It is 
also possible to define representability over a division ring, but we shall not discuss 
this here. 

It turns out that some matroids are representable over every field (the so-called 
regular matroids), some over no fields (see Section 11), and some over only a re- 
stricted class of fields. Of special interest are the binary matroids which are rep- 
resentable over the field with two elements. It is not difficult to show that every 
graphic matroid is binary, and consequently we shall sometimes have to restrict 
ourselves to binary matroids when trying to extend properties of graphs to matroids. 


(5) ALGEBRAIC MATROIDS. Let F be a field, and K be an extension of F. If E is a 
finite set of elements of K, a subset A of E is called dependent if the elements of A 
are algebraically dependent over F, i.e., if they satisfy a polynomial equation with 
coefficients in F. It can be proved that these dependent sets form the dependent sets 
of a matroid on E. Any matroid isomorphic to a matroid obtained in this way is 
called an algebraic matroid. At the time of writing, no one has yet proved the 
existence of matroids which are not algebraic. 


(6) TRANSVERSAL MATROIDS. If E is a non-empty finite set and $\mathcal{S} = (S_1, \ldots, S_m)$ 
is a family of non-empty subsets of E, then it follows from (T3), (T4) and (T6) that 
the partial transversals of $\mathcal{S}$ may be taken as the independent sets of a matroid 
$M(\mathcal{S})$ on E. The bases of $M(\mathcal{S})$ are then the maximal partial transversals of $\mathcal{S}$, and 
the rank function is the function $\sigma$ defined in Section 4. 

We shall call a matroid M transversal if M can be obtained in the above way (for 
a suitable choice of E and $\mathcal{S}$). For example, the circuit matroid of the graph shown 
in figure 6 is a transversal matroid on {a,b,c}, since its independent sets are the 
partial transversals of the family $\mathcal{S} = (S_1, S_2)$, where $S_1 = \{a\}$ and $S_2 = \{b, c\}$; 
an example of a non-transversal matroid will be given later in this section. Note 
that if M is any k-uniform matroid on a set E, then M is a transversal matroid, 
since its independent sets are the partial transversals of the family $\mathcal{S} = (E, \ldots, E)$ 
containing k copies of E. 
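
Whether a given set is a partial transversal is a bipartite matching question, and may be decided by the usual augmenting-path method. The Python sketch below is one way of doing this (the function name `is_partial_transversal` is our own); it is applied to the family consisting of {a} and {b, c}, and to a family of two copies of a four-element set.

```python
from itertools import combinations


def is_partial_transversal(elements, family):
    """True if the elements can be assigned to distinct members of the family."""
    match = {}  # index of a member of the family -> element assigned to it

    def try_assign(x, seen):
        for i, members in enumerate(family):
            if x in members and i not in seen:
                seen.add(i)
                # Use member i if it is free, or if its element can move elsewhere.
                if i not in match or try_assign(match[i], seen):
                    match[i] = x
                    return True
        return False

    return all(try_assign(x, set()) for x in elements)


family = [{"a"}, {"b", "c"}]
ground = "abc"
independent = {frozenset(c)
               for r in range(len(ground) + 1)
               for c in combinations(ground, r)
               if is_partial_transversal(c, family)}
print(sorted("".join(sorted(s)) for s in independent))

# Two copies of E give the 2-uniform matroid on E.
two_copies = [set("abcd"), set("abcd")]
print(is_partial_transversal("ab", two_copies), is_partial_transversal("abc", two_copies))
```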

Transversal matroids have been characterized by Mason [16] and others, but 
there is as yet no known characterization in terms of excluded minors. 


(7) GAMMOIDS. If E and Y are two disjoint sets of vertices in a directed graph* D, 
then we can define a matroid on E by taking as independent sets all those subsets A 
of E with the property that there exist $|A|$ directed paths, no two of which have any 
vertices in common, from A to a subset of Y (see figure 8). Any matroid obtained in 
this way is called a gammoid. Clearly gammoids may also be defined in terms of 
(undirected) graphs, although it is not known whether all gammoids can be obtained 
in this way. Note that a transversal matroid may be regarded as a gammoid in 
which the underlying directed graph is bipartite. 


FIG. 8 


* A directed graph (or digraph) is defined in the same way as a graph, except that the 
edges are now ordered pairs of vertices (v, w). A directed path is then a finite sequence of 
distinct edges of the form $(v_0, v_1), (v_1, v_2), \ldots, (v_{m-1}, v_m)$. 


(8) THE FANO MATROID. Of particular importance in matroid theory is the 
matroid F on the set $E = \{a, b, \ldots, g\}$, in which the bases are all those subsets of E 
which contain exactly three elements, except {a,b,c}, {a,d,e}, {a,f,g}, {b,d,g}, 
{b,e,f}, {c,d,f} and {c,e,g}. F is usually called the Fano matroid, and may be 
represented diagrammatically as in figure 9, where the bases are precisely those 
subsets of three elements which do not lie on a line. It can be proved that F is 
non-graphic, non-cographic and non-transversal; it can also be shown (see Section 11) 
that F is representable over any field of characteristic two, but not over any other 
fields. 
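
The statement about characteristic two can be illustrated numerically. The Python sketch below uses the projective coordinates assigned to the points a, ..., g in Section 11 and tests each set of three points for dependence by evaluating a 3 x 3 determinant; the helper names are our own. Reducing the determinant modulo 2 gives exactly the seven excluded sets, whereas over the rationals the same coordinates leave {b,e,f} independent. This is a check of one coordinate assignment, not a proof of non-representability.

```python
from itertools import combinations

LINES = {frozenset(s) for s in ["abc", "ade", "afg", "bdg", "bef", "cdf", "ceg"]}

# Coordinates as chosen in Section 11.
COORDS = {
    "a": (1, 0, 0), "b": (1, 1, 0), "c": (0, 1, 0), "d": (1, 1, 1),
    "e": (0, 1, 1), "f": (1, 0, 1), "g": (0, 0, 1),
}


def det3(u, v, w):
    """Determinant of the 3 x 3 integer matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def determinant(triple):
    u, v, w = (COORDS[p] for p in sorted(triple))
    return det3(u, v, w)


triples = [frozenset(t) for t in combinations(sorted(COORDS), 3)]
dependent_char2 = {t for t in triples if determinant(t) % 2 == 0}
dependent_char0 = {t for t in triples if determinant(t) == 0}

print(dependent_char2 == LINES)
print(sorted("".join(sorted(t)) for t in LINES - dependent_char0))
```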


FIG. 9 


(9) UNIONS OF MATROIDS. If $M_1 = (E_1, \rho_1)$ and $M_2 = (E_2, \rho_2)$ are two matroids, 
there are various ways of defining their union. For example, if $E_1$ and $E_2$ are disjoint, 
then their disjoint union is the matroid on $E_1 \cup E_2$ whose independent sets are 
obtained by taking the union of an independent set in $M_1$ and an independent set 
in $M_2$; note that the resulting rank function is then simply $\rho_1 + \rho_2$. As we shall see 
in Section 12, a similar construction may be used when $E_1 = E_2$, although in this 
case the rank function is rather more complicated. 


(10) FURTHER EXAMPLES. In addition to the matroids already described, there are 
several other important types of matroid which we cannot discuss here. Among 
these are the matroids derived from simplicial complexes, and those associated with 
the partitions of a set. The reader will find both of these examples discussed in Crapo 
and Rota’s book [5]. 


7. Matroids and lattices. There has recently been a lot of interest in the in- 
vestigation of matroids from a lattice-theoretic point of view. In this section, we shall 
indicate how this connection between matroids and lattices arises; a fuller discussion 
will be found in Crapo and Rota [5]. 

If M is a matroid on a set E, and c is the closure operation on M described in 
Section 5, then we define a subset A of E to be a closed set (or subspace) if c(A) = A. 
For example, if M is a simple matroid, then the empty set and all subsets of E 
containing only one element are closed. It follows easily from properties (ci) and (cii) 
that if A and B are closed subsets of E, then so is $A \cap B$. It is not in general true, 
however, that their union $A \cup B$ is necessarily closed, but we can assert that $c(A \cup B)$ 
is always closed. 

It is not difficult to prove from these remarks that if M is a matroid on a set E, 
then the closed subsets of E form a lattice L(M) under set inclusion, in which the 
meet $A \wedge B$ of two closed sets A and B is equal to $A \cap B$, and their join $A \vee B$ is 
equal to $c(A \cup B)$. For example, the lattice corresponding to the 3-uniform matroid 
on the set E = {a,b,c,d} is as shown in figure 10. 


FIG. 10 
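
The closed sets of the 3-uniform matroid on {a, b, c, d} can be listed directly from its rank function. The Python sketch below (with our own function names) computes them, together with the meet and join of two closed sets as described above.

```python
from itertools import combinations

E = frozenset("abcd")
K = 3  # the 3-uniform matroid on E


def rank(a):
    return min(len(a), K)


def closure(a):
    a = frozenset(a)
    return frozenset(x for x in E if rank(a | {x}) == rank(a))


all_subsets = [frozenset(c) for r in range(len(E) + 1)
               for c in combinations(sorted(E), r)]
flats = [s for s in all_subsets if closure(s) == s]


def meet(a, b):
    return a & b


def join(a, b):
    return closure(a | b)


print(len(flats))
print(sorted(join(frozenset("ab"), frozenset("cd"))))
```

The closed sets found in this way are the empty set, the four one-element sets, the six two-element sets and E itself; the join of two distinct two-element closed sets is always E.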


It turns out that whatever matroid M we start with, the resulting lattice L(M) is 
always a semimodular lattice. This means that the lattice has a height (i.e., rank) 
function r such that r(A) measures the length of a maximal chain from the lowest 
element $c(\emptyset)$ to the element A of the lattice, and the function r satisfies the inequality 
$r(A \vee B) + r(A \wedge B) \le r(A) + r(B)$, where A and B are any two elements of the 
lattice. Moreover, it can be shown that every element in the lattice can be expressed 
as the join of subsets of rank one. Such a lattice is usually called a geometric lattice 
(see Birkhoff [1]), and the elements of rank one, two, three and $r(E) - 1$ are called 
points, lines, planes, and hyperplanes, respectively. The reader should check that 
in the case of a linear matroid, these definitions agree with those given in the previous 
section. Note also that if M is a simple matroid, then the points of L(M) are precisely 
the one-element subsets of E. 

We have seen that if we are given a matroid M on a set E, then we can define a 
geometric lattice L(M) whose elements are the closed subsets of E. It can also be 
shown that if L is any geometric lattice, then we can obtain a simple matroid M(L) 
defined on the set E of points of L, by defining a subset A of E to be independent in 
M(L) if and only if the join of the points in A has (lattice-) rank equal to |A|. It is 
not difficult to see that the resulting matroid M(L) is the same as the simple matroid 
which gives rise to L in the manner described above. (The reader should verify this 
in the case of the lattice L of figure 10.) 

It follows from this that there is a one-one correspondence between simple 
matroids and geometric lattices, and hence that matroid theory may be regarded 
essentially as the study of geometric lattices. 


8. Duality in graph theory. In this section we shall describe various ways of 
defining the dual of a graph. The idea of this is to motivate some of the results on 
matroid duality to be presented in the next section. 

A graph G is called a planar graph if it can be embedded in the plane without 
crossings (in other words, the lines representing two edges of G are allowed to 
intersect only at a point of the plane which corresponds to a vertex to which they are 
both incident); any such embedding is then called a plane graph. (Strictly speaking, 
this should be called a ‘plane embedding of a planar graph’, but the abbreviation 
‘plane graph’ is now standard.) The regions into which the edges of a plane graph 
divide the plane are called faces. For example, both of the graphs of figure 11 are 
planar, but only the second one is a plane graph; note that the second graph has 
four faces. 


FIG. 11 


If G is a plane graph, we form its geometric-dual G* as follows: we take as the 
vertices of G* one point inside each face of G, and then, for each edge e of G (adjoining 
faces $f_1$ and $f_2$, say), we draw a corresponding edge e* of G* which crosses e (but 
no other edge of G) and joins those vertices of G* which lie inside $f_1$ and $f_2$. This 
procedure is illustrated in figure 12, with small crosses and dashed lines denoting 
the vertices and edges of G*. It is a simple matter to check that if G is a connected 
plane graph, then its double-dual G** is isomorphic to G. 


FIG. 12 


If G is a planar graph, then a geometric-dual G* of G may be defined by taking 
any plane embedding of G, and forming the geometric-dual in the manner described 
above. It follows from this that a planar graph may have several different 
geometric-duals; all of them, however, have certain properties in common: for example, the 
following two results always hold ([36] Section 15): 


(G8) A set of edges of G forms a circuit in G if and only if the corresponding 
set of edges of G* forms a cutset in G*. 


(G9) If G is a planar graph, then G is bipartite if and only if G* is Eulerian 
(i.e., the set of edges of G* can be partitioned into disjoint circuits of G*). 


It is obvious that a graph G is planar if and only if G has a geometric-dual, since 
we have not defined geometric-duality for non-planar graphs. What we should like 
to do is to find a definition of duality which generalizes the geometric-dual of a 
planar graph, but which also tells us whether or not a given graph is planar. One 
such definition, arising out of (G8), is as follows: a graph G* is called an abstract- 
dual of a graph G if there is a one-one correspondence between the edges of G and 
those of G* with the property that a set of edges of G forms a circuit of G if and only 
if the corresponding set of edges of G* forms a cutset of G*. The following results 
now hold (see [36] Section 15, and Parsons [24]): 


(G10) If G* is an abstract-dual of G, then G is an abstract-dual of G*. 
(G11) A graph is planar if and only if it has an abstract-dual. 


An alternative definition of duality (which can be shown to be equivalent to the 
previous one) was given by Whitney. In this definition, H* denotes the graph obtained 
from G* by removing the edges corresponding to those of H, and $\kappa$ and $\gamma$ are 
defined as in Section 2. We 
now define G* to be a Whitney-dual of G if there is a one-one correspondence between 
the edges of G and those of G* with the property that 


$\gamma(H) + \kappa(H^*) = \kappa(G^*)$, 


for any subgraph H of G whose vertex-set V(H) is equal to the vertex-set of G. The 
following results can then be proved (see [36] Section 16): 


(G12) If G* is a Whitney-dual of G, then G is a Whitney-dual of G*. 
(G13) A graph is planar if and only if it has a Whitney-dual. 


Although the definitions of abstract-duality and Whitney-duality may seem at 
first sight rather strange, they will turn out to be direct consequences of the definition 
of matroid duality. 


9. Duality in matroids. Our aim in this section is to show how one can define 
matroid duality in such a way that the circuit and cutset matroids of a graph are duals 
of each other. The resulting definition will include the three definitions of duality 
given in the previous section, and will also enable us to explain the similarity in graph 
theory between the properties of circuits and those of cutsets. 

If M is a matroid on a set E, we define the dual matroid M* to be the matroid 
on E whose bases are precisely the complements of the bases of M; in other words, 
B* is a base of M* if and only if $E - B^*$ is a base of M. It is not difficult to check 
that this does in fact define a matroid, and that its rank function $\rho^*$ is given by 


$\rho^*(A) = |A| + \rho(E - A) - \rho(E)$, 


where $A \subseteq E$, and $\rho$ is the rank function of M. 

It follows immediately from this definition that (in contrast to the duality of 
planar graphs) every matroid has a dual, and that this dual is unique. Moreover, it 
is clear that the double-dual M** is equal to M. It can also be proved without too 
much difficulty (see [36] Section 32) that if G is a graph, then the circuits of the 
dual matroid of M(G) are precisely the cutsets of G, and hence that the dual of 
M(G) is simply M*(G). 
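
The definition of the dual, and the expression for its rank function, can be checked on the matroid on {a, b, c} of Section 6(2), whose bases are {a,b} and {a,c}. In the Python sketch below the function names are our own; the rank of a set is computed as the largest number of its elements lying in a single base.

```python
from itertools import combinations


def subsets(ground):
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def dual_bases(ground, bases):
    """Bases of the dual: complements of the bases."""
    return {ground - b for b in bases}


def rank(a, bases):
    """Largest number of elements of a contained in a single base."""
    return max(len(a & b) for b in bases)


ground = frozenset("abc")
bases = {frozenset("ab"), frozenset("ac")}
cobases = dual_bases(ground, bases)

# Compare the rank in the dual with |A| + rank(E - A) - rank(E).
formula_holds = all(
    rank(a, cobases) == len(a) + rank(ground - a, bases) - rank(ground, bases)
    for a in subsets(ground)
)
print(sorted("".join(sorted(b)) for b in cobases))
print(formula_holds, dual_bases(ground, cobases) == bases)
```

In the dual of this example, a is a loop and b, c are parallel, which agrees with the cutsets {a} and {b,c} of a graph consisting of an edge a followed by two parallel edges b and c.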

To see where this is all leading, let us define some ‘co-notation’. If M* is the dual 
of M, we define a cocircuit of M to be a circuit of M* (so that, for example, the 
cocircuits of the circuit matroid M(G) of a graph G are simply the cutsets of G). 
Similarly, we define a cobase of M to be a base of M*, the corank of M to be the 
rank of M*, and so on. Analogously, we say that a matroid is cographic if its dual 
is graphic, and in view of the remarks made above, this definition agrees with the 
one given in Section 6. (As an exercise, the reader may like to check that the 
cocircuits of M are precisely the complements of the hyperplanes of M.) The reason 
for introducing these extra definitions is that we need now deal only with the matroid 
M, instead of dealing with both M and M*. 

To illustrate this, let us first explain the similarity between the properties of 
circuits and the properties of cutsets in a graph (as illustrated by (G1), (G2) and 
(G7) of Section 2). The reason for this is simply that any result on the circuits of a 
matroid immediately gives us two results about graphs, since we can apply our 
matroid result either to the circuit matroid M(G) of a graph G (giving us a result on 
the circuits of G), or to the cutset matroid M*(G) of G (giving us the corresponding 
result for the cutsets of G). 

As an example of this, let us consider property ($\mathcal{C}$ii) of Section 5. If we apply 
this to M(G) we immediately deduce the result of (G1) (i); however, if we apply it 
to M*(G), then we obtain the result of (G1) (ii). In other words, the two results of 
(G1) are simply dual forms of a single result, and so, instead of having to prove two 
separate results in graph theory, we need prove only one result in matroid theory, 
and then use duality. 

Another example is given by the results of (G7). It is easy to prove that in any 
matroid, every cocircuit intersects every base (since if C* and B are a cocircuit and 
a base of M which are disjoint, then C* is a circuit of M* contained in a base $E - B$ 
of M*, giving the required contradiction). On applying this result to the circuit 
matroid M(G) of G, we immediately obtain (G7) (i); on applying it to the cutset 
matroid M*(G), we immediately obtain (G7) (ii). Note that both of these results may 
also be deduced from the dual of the above matroid result, namely that in any 
matroid, every circuit intersects every cobase. 


It is probably worth pointing out at this stage that the result of (G2) does not 
generalize completely to arbitrary matroids. In fact, if C is any circuit and C* is any 
cocircuit in a matroid M, then all we can say in general is that the number of elements 
common to C and C* is not equal to one. However, if M is a binary matroid (as is 
the case with the circuit matroid of a graph), then it turns out that the number of 
elements common to C and C* is even, generalizing (G2). Since the converse of this 
is also true, it follows that the dual of a binary matroid is binary. 


We have just seen how matroid duality can be used to give us greater insight into 
problems involving the circuits and cutsets of a graph. We now show briefly how 
the results of Section 8 fit in with the definition of matroid duality. 


We note first that the matroid definition of duality generalizes the definition of 
an abstract-dual of a graph G; this is clear since the circuits of M(G) correspond to 
the cocircuits of M*(G) and hence the circuits of G correspond to the cutsets of G*. 
Since the abstract-dual of a planar graph is equivalent to the Whitney-dual and 
generalizes the geometric-dual, it follows that these other two definitions of duality 
are also consequences of the matroid definition of duality. Note that although a 
planar graph may have several different geometric-duals, they all give rise to the 
same matroid. Note also that the rather artificial-looking equation in the definition 
of the Whitney-dual is simply a restatement of the expression for $\rho^*$ given at the 
beginning of this section, and that the equation M** = M is the natural matroid 
generalization of properties (G10) and (G12). 

We conclude this section by remarking that (G9) has a natural extension to binary 
matroids which takes the following form: if M is a binary matroid, then M is 
bipartite if and only if M* is Eulerian; in this statement, a matroid is called 
bipartite if every circuit contains an even number of elements, and is called Eulerian 
if E can be expressed as the union of disjoint circuits. The reader is referred to Welsh 
[33] for a proof of this result. 


10. Graphic Matroids. In this section we shall present Tutte’s fundamental 
result on the characterization of graphic matroids [31]. We shall find it convenient, 
however, to start with a few remarks about planar graphs. 


If G is any graph, then we can obtain new graphs from G by any succession of 
the following two operations: 

(i) deleting one or more of its edges; 

(ii) contracting one or more of its edges, i.e., removing an edge e = {v,w} and 
identifying the vertices v and w in such a way that all edges which were formerly 
incident to either v or w are now incident to the new vertex (see figure 13). 


FIG. 13 


In 1930, Kuratowski [14] obtained a characterization of planar graphs which 
has been expressed by Harary and Tutte [9] in the following form (see [36] Section 
12): 


KURATOWSKI’S THEOREM. A connected graph is planar if and only if it cannot 
be reduced to either $K_5$ or $K_{3,3}$ (see figure 7) by any succession of the above 
operations. 


The operations of deletion and contraction of edges in a graph have analogues 
in matroid theory. If M is a matroid on a set E, and A is a subset of E, then the 
deletion matroid (or restriction matroid) $M \times A$ is the matroid on A whose circuits 
are precisely those circuits of M which are contained in A; similarly, the contraction 
matroid $M \cdot A$ is the matroid on A whose cocircuits are precisely those cocircuits of 
M which are contained in A. We leave it to the reader to verify that if M is the 
circuit matroid of a graph G, then these matroids correspond to the graphs obtained 
by the operations described above. Any matroid which is obtained from M by a 
succession of deletions and contractions is called a minor of M. 
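
Both operations are easily expressed in terms of bases. In the Python sketch below (the function names are our own), the restriction to A keeps the largest intersections of bases with A, and the contraction onto A is obtained by dualizing, restricting and dualizing again, which is one way of reading the definition by cocircuits. The example is the matroid on {a, b, c} with bases {a,b} and {a,c}.

```python
def dual(ground, bases):
    """Bases of the dual matroid on the given ground set."""
    return {ground - b for b in bases}


def restrict(bases, a):
    """Bases of the deletion (restriction) matroid on a."""
    r = max(len(b & a) for b in bases)
    return {b & a for b in bases if len(b & a) == r}


def contract(ground, bases, a):
    """Bases of the contraction matroid on a, via dual-restrict-dual."""
    return dual(a, restrict(dual(ground, bases), a))


ground = frozenset("abc")
bases = {frozenset("ab"), frozenset("ac")}
keep = frozenset("ac")  # remove the element b

print(sorted("".join(sorted(s)) for s in restrict(bases, keep)))
print(sorted("".join(sorted(s)) for s in contract(ground, bases, keep)))
```

In graph terms (taking b and c to be parallel edges), deleting b leaves a and c independent, whereas contracting b turns c into a loop, so that {a} is the only base of the contraction.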

It is a simple matter to show that if a matroid M is binary, graphic or cographic, 
then any minor of M has the same property. It follows from Section 6 that if M is a 
graphic matroid, then M is binary and contains no minor which is isomorphic to 
$M^*(K_5)$, $M^*(K_{3,3})$, F or F*, where F denotes the Fano matroid. 

In 1958, Tutte proved in an important series of papers that these conditions were 
not only necessary conditions for a matroid to be graphic, but were also sufficient, 
thus characterizing graphic matroids. He also proved the very deep result that a 
binary matroid M is regular (see Section 6(4)) if and only if it contains no minor 
isomorphic to F or F*. We can thus state Tutte’s result in the following form: 


TUTTE’S THEOREM. A matroid M is graphic if and only if it is regular and 
contains no minor isomorphic to $M^*(K_5)$ or $M^*(K_{3,3})$. 


On applying Tutte’s theorem to M*, and using the fact that the dual of a regular 
matroid is regular, we immediately obtain a characterization of cographic matroids: 


THEOREM. A matroid M is cographic if and only if it is regular and contains no 
minor isomorphic to $M(K_5)$ or $M(K_{3,3})$. 


Since a graph is planar if and only if it has a dual, it seems reasonable to define a 
matroid M to be planar if both M and its dual M* correspond to graphs, in other 
words if M is both graphic and cographic. In view of the above remarks, we can now 
give a characterization of planar matroids which corresponds to Kuratowski’s 
characterization of planar graphs: 


THEOREM. A matroid is planar if and only if it is regular and contains no 
minor isomorphic to $M(K_5)$, $M(K_{3,3})$ or their duals. 


11. Representable matroids. In this section we discuss briefly the representa- 
bility of matroids from a more geometrical point of view. In what follows, every 
matroid M will be assumed simple; it follows from this that if M is representable, 
then M must be isomorphic to a matroid defined in some projective space over 
some field, and hence that the configurations which arise must not contradict well- 
known results in projective geometry. 

To illustrate this, we first consider Pappus’ theorem which states that if in figure 
14, the points {a,b,c} and the points {d,e,f} are collinear, then so are the points 
{g,h,i}. It follows that if we take M to be the matroid on {a, b, ..., i} whose bases 
are all those subsets containing three elements except 


{a,b,c}, {a,e,g}, {a,f,h}, {b,d,g}, {b,f,i}, {c,d,h}, {c,e,i} and {d,e,f}, 


then M cannot be representable over any field, since if it were, then by Pappus’ 
theorem the set {g, h, i} would also have to be a dependent set, contradicting the fact 
that {g,h,i} is a base of M. 


FIG. 14 


A similar situation holds with Desargues’ theorem, which states that if in figure 
15 the triangles bcd and efg are in perspective from the point a (i.e., the points 
{a, b, e} are collinear, as are the points {a,c,f} and {a,d,g}), then the points {h, i,j} 
are also collinear. It follows as before that if M is the matroid on {a, ..., j} whose 
bases are all those subsets containing three elements except 
{a,b,e}, {a,c,f}, {a,d,g}, {b,c,h}, {b,d,i}, {c,d,j}, {e,f,h}, {e,g,i} 
and {f,g,j}, 


then M cannot be representable over any field. 


FIG. 15 


Before leaving the subject of representability, we return briefly to the Fano 
matroid F which corresponds (see figure 9) to the seven-point projective plane. To 
investigate the representability of F, we assign plane projective coordinates to each 
point, choosing (as we may) 


a = (1, 0, 0), c = (0, 1, 0), g = (0, 0, 1) and d = (1, 1, 1); 


it follows immediately that b = (1, 1,0), e = (0, 1,1) and f= (1,0, 1). But these three 
points are collinear if and only if the determinant formed by their coordinates is 
zero, i.e., if and only if 2 = 0. It follows that the Fano matroid is representable only 
over fields of characteristic two. It also follows from this discussion that if F’ is the 
matroid which corresponds to figure 9 with the line joining the points {b,e,f} 
removed, then F’ is representable over all fields except those of characteristic two. 
Note that the disjoint union of F and F’ is not representable over any field. 
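
The determinant computation above is easy to check mechanically. The following Python sketch (the helper names `det3` and `collinear_mod_p` are illustrative and do not come from the text) evaluates the determinant formed by the coordinates of b, e and f over the integers and then reduces it modulo a prime p. It therefore decides collinearity only over the prime fields GF(p); fields of prime-power order are not modelled here.

```python
def det3(rows):
    """Determinant of a 3x3 integer matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Projective coordinates taken from the text.
PT_B = (1, 1, 0)
PT_E = (0, 1, 1)
PT_F = (1, 0, 1)

# Integer determinant of the three coordinate vectors.
DET_BEF = det3([PT_B, PT_E, PT_F])


def collinear_mod_p(p):
    """True when b, e, f are collinear over GF(p), i.e. the determinant vanishes mod p."""
    return DET_BEF % p == 0
```

Calling `collinear_mod_p(2)` and `collinear_mod_p(3)` contrasts the characteristic-two case with an odd characteristic.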

For further results of this kind, and a more complete discussion of representability 
in general, the reader is referred to MacLane [15], or to the survey article by Ingleton 
[12]. 


12. Transversal matroids. Up to now, we have been concerned primarily with 
the relationship between matroid theory and graph theory. We now indicate briefly 
how matroid theory can be used to prove results in transversal theory. This subject 
is dealt with in far greater depth in Mirsky’s book [18], and the reader is referred 
there for further details. 


We recall that if $\mathcal{S} = (S_1, \ldots, S_m)$ is a family of non-empty subsets of a finite set 
E, then the partial transversals of $\mathcal{S}$ form the independent sets of a matroid $M(\mathcal{S})$ 
on E. Using this fact we can give a very elementary matroid proof of property (T2) 
(of Section 4). It is clear that if $\mathcal{S}$ has a transversal containing A, then the properties 
(i) and (ii) of (T2) hold; conversely, if (i) and (ii) hold, then by (ii), A is an independent 
set in $M(\mathcal{S})$ and hence can be extended to a base of $M(\mathcal{S})$. The result now follows 
from (i), since every base of $M(\mathcal{S})$ is a transversal of $\mathcal{S}$. 

There are several other proofs in transversal theory which can be simplified 
using matroids, and one may ask whether there is a generalization of Hall’s theorem 
(see Section 4) which gives necessary and sufficient conditions for the existence of an 
independent transversal of a family $\mathcal{S}$ of subsets of a finite set E with a matroid 
M defined on it. Such a generalization was given by R. Rado [27] in 1942: 


RADO’S THEOREM. Let E be a finite set, M a matroid on E, and $\mathcal{S} = (S_1, \ldots, S_m)$ 
be a family of non-empty subsets of E. Then $\mathcal{S}$ has a transversal which is independent 
in M if and only if, for each k satisfying $1 \le k \le m$, the union of any k of the 
subsets $S_i$ has rank at least k. 


Note that if M is the discrete matroid on E, then Rado’s theorem reduces to 
Hall’s theorem. 
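
Rado's condition can be tested by exhaustive search on small examples. The Python sketch below is only an illustration: it assumes the matroid is supplied as an explicit list of bases, so that the rank of a set is the largest intersection with a base, and all function names (`make_rank`, `rado_condition`, `has_independent_transversal`) are hypothetical. With the discrete matroid, whose only base is E, the rank of a set is its size and the check reduces to Hall's condition.

```python
from itertools import combinations, product


def make_rank(bases):
    """Rank oracle for a matroid given by its bases: rank(A) = max |A & B| over bases B."""
    base_sets = [frozenset(b) for b in bases]

    def rank(subset):
        subset = frozenset(subset)
        return max(len(subset & b) for b in base_sets)

    return rank


def rado_condition(family, rank):
    """Check that the union of any k members of the family has rank at least k."""
    m = len(family)
    for k in range(1, m + 1):
        for chosen in combinations(range(m), k):
            union = set().union(*(family[i] for i in chosen))
            if rank(union) < k:
                return False
    return True


def has_independent_transversal(family, rank):
    """Brute force: choose one element per set, all distinct and jointly independent."""
    m = len(family)
    for choice in product(*family):
        if len(set(choice)) == m and rank(choice) == m:
            return True
    return False


E = {1, 2, 3, 4}
FREE = make_rank([E])                          # discrete matroid: every subset independent
U24 = make_rank(combinations(sorted(E), 2))    # 2-uniform matroid on four elements

FAMILY_A = [{1, 2}, {2, 3}, {3, 4}]
FAMILY_B = [{1, 2}, {3, 4}]
```

Here `FAMILY_A` has a transversal in the ordinary sense, but no transversal that is independent in the 2-uniform matroid, since no three elements are independent there.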

Rado’s theorem has a wide variety of applications in transversal theory. One of 
the most well-known of these (see [36] Sections 27, 33) is the problem of finding 
necessary and sufficient conditions for two families $\mathcal{S}$ and $\mathcal{T}$ of subsets of a given 
set E to share a common transversal. This problem is of some importance in the 
study of timetabling, as may be seen by taking, for example, the underlying set E to 
be the set of hours when mathematics lectures can be given, the family $\mathcal{S}$ to consist 
of the sets of hours when each of the professors is able to lecture, and the family $\mathcal{T}$ 
to consist of the hours when each of the classrooms is available. A common transversal 
then gives us a way of assigning professors to rooms at times when both are available. 


Rado’s theorem has also been used (by Welsh [34]) to obtain results on the 
union of matroids. If $M_1, \ldots, M_k$ are matroids on the same set E, with rank functions 
$\rho_1, \ldots, \rho_k$ respectively, then we can define a new matroid $M_1 \cup \cdots \cup M_k$ on E by 
taking as independent sets all possible unions of the form $A_1 \cup \cdots \cup A_k$, where $A_i$ is 
independent in $M_i$ for each i. That this actually defines a matroid seems to have been 
first pointed out by Nash-Williams [23], and its rank function $\rho$ is given by 


$\rho(A) = \min_{X \subseteq A} \{\rho_1(X) + \cdots + \rho_k(X) + |A - X|\}, \qquad (A \subseteq E).$ 


In particular, if $M_1 = \cdots = M_k$ ($= M$, say), then the rank function simplifies to 
$\rho(A) = \min_{X \subseteq A} \{k\rho(X) + |A - X|\},$ 


where, on the right-hand side, $\rho$ denotes the rank function of M. 
This last result has been used by Edmonds [7], and others, to give simple proofs 
of several deep results in graph theory, transversal theory, and vector spaces. For 
example, if M is the circuit matroid of a graph G, then G contains k edge-disjoint 
spanning forests if and only if M contains k disjoint bases, i.e., if and only if the 
union of k copies of M has rank at least $k\rho(E)$. This can be restated as follows: 


THEOREM. A graph G contains k edge-disjoint spanning forests if and only if, 
for each subgraph H of G, $k(\kappa(G) - \kappa(H)) \le m(G) - m(H)$, where $m(H)$ denotes 
the number of edges in H. 
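
The rank formula for the union of k copies of M, on which this theorem rests, can be checked by brute force on a small graph. The sketch below is an illustration under stated assumptions: it takes M to be the circuit matroid of the complete graph on four vertices, computes ranks by growing a spanning forest with union-find, and compares the formula with an exhaustive search for the largest edge set that splits into k forests. All function names are hypothetical, and both searches are exponential in the number of edges, so this is suitable only for very small graphs.

```python
from itertools import combinations


def graphic_rank(edges):
    """Rank in the circuit matroid: number of edges in a spanning forest of the edge set."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    rank = 0
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # edge joins two components: keep it
            parent[root_u] = root_v
            rank += 1
    return rank


def subsets(items):
    """All subsets of a finite collection, as tuples."""
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def union_rank_formula(edges, k):
    """min over X of k*rho(X) + |E - X|, the rank of the union of k copies on all of E."""
    return min(k * graphic_rank(x) + len(edges) - len(x) for x in subsets(edges))


def union_rank_brute(edges, k):
    """Size of the largest edge set that is a union of k forests (exhaustive search)."""
    forests = [frozenset(s) for s in subsets(edges) if graphic_rank(s) == len(s)]
    reachable = {frozenset()}
    for _ in range(k):
        reachable = {a | f for a in reachable for f in forests}
    return max(len(a) for a in reachable)


K4_EDGES = list(combinations(range(4), 2))   # the six edges of the complete graph K4
```

For this graph the union of two copies has rank equal to twice the rank of M, which matches the fact that its six edges split into two spanning trees; with three copies the rank stays at six, so three edge-disjoint spanning trees cannot exist.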


In a similar way, the above result can be used to give a necessary and sufficient 
condition for G to split into at most k forests. In this case, the rank of the union of k 
copies of M is simply $|E|$, and the result takes the following form: 


THEOREM. A graph G can be split up into at most k forests if and only if, for 
each subgraph H of G, $k \cdot \kappa(H) \ge m(H)$. 


Although they are easy deductions from our results on the union of matroids, 
both of these theorems are very difficult to prove by straightforward graph-theoretic 
techniques (see the papers of Tutte [30] and Nash-Williams [21], [22]). 

We conclude this section by stating a theorem of Horn [11] on vector spaces, 
which can be deduced in exactly the same way as the second of the above theorems. 


THEOREM. If E is a finite set of vectors in a vector space, then E can be divided 
into k disjoint linearly independent subsets if and only if, for each subset A of E, 
$|A| \le k \cdot \operatorname{rank} A$, where the rank of A is as defined in Section 3. 


13. Some recent results. In this final section, we shall discuss a few results in 
matroid theory which have been proved in the last two or three years. 


(1) THE ENUMERATION OF MATROIDS. It is easy to see that the number $f(n)$ of 
non-isomorphic matroids on a set E of n elements cannot exceed $2^{2^n}$, since E has 
$2^n$ subsets, and each of these may be dependent or independent. What is surprising 
is that this seemingly rather crude upper bound turns out to be very sharp, since 
Piff and Welsh [26] have proved that for any value of $\lambda < 1$, $f(n)$ is bounded below 
by $2^{2^{\lambda n}}$, if n is sufficiently large. In fact, they managed to obtain the following sharper 
bounds for n sufficiently large: 


$2^{\alpha} < f(n) < 2^{\beta},$ 


where $\alpha = 2^n/n^?$ and $\beta = 2^n n^?$. 

If we now restrict ourselves to transversal matroids, and let t(n) denote the 
number of non-isomorphic transversal matroids on a set E of n elements, then each 
of the subsets of E in the family $\mathcal{S}$ can be chosen in $2^n$ ways, and hence $t(n)$ is bounded 
above by $(2^n)^n = 2^{n^2}$. Stricter bounds have been obtained by Heron and Piff 
(unpublished) who have proved that 
2%” < 1(n) < 2’. 


Little is known about the number of non-isomorphic graphic or representable 
matroids. 


(2) CONNECTIONS BETWEEN VARIOUS TYPES OF MATROIDS. It has recently been 
shown by Piff and Welsh [25] that if $M_1$ and $M_2$ are matroids on the same set, both 
of which are representable over a field F, then their union $M_1 \cup M_2$ is also representable 
over F, provided that F has sufficiently large cardinality. It follows, using 
the easily-proved fact that any transversal matroid can be expressed as the union of 
matroids of rank one, that every transversal matroid is representable over all 
sufficiently large fields (and, in particular, over every infinite field). Since every 
k-uniform matroid is transversal, it follows from this that every k-uniform matroid 
is representable over all sufficiently large fields. 

It has also been proved recently by de Sousa and Welsh [6] that a transversal 
matroid is binary if and only if it is graphic. Related to this is the result of Bondy 
[2] that the circuit matroid M(G) of a graph G is transversal if and only if G contains 
no subgraph homeomorphic to $K_4$ or $C_n^2$ for some n. (In this statement, $K_4$ denotes 
either of the graphs shown in figure 11, and $C_n^2$ is the graph obtained by ‘doubling 
up’ the edges of an n-gon (see figure 16); the theorem then states that no subgraph 
of G can be obtained by inserting new vertices into the edges of either $K_4$ or $C_n^2$.) 


FIG. 16 


(3) PRESENTATIONS OF TRANSVERSAL MATROIDS. If M is a transversal matroid on 
a set E, whose independent sets are the partial transversals of a family $\mathcal{S}$ of subsets 
of E, then $\mathcal{S}$ is called a presentation of M. It is not difficult to show that if M has 
rank r, then there exists a presentation of M which contains only r subsets. More- 
over, it has been shown by Bondy and Welsh [3] that these subsets can all be taken 
to be cocircuits of M. 




(4) RESULTS ON GAMMOIDS. It is not difficult to show that the dual of a trans- 
versal matroid is not necessarily transversal, and it is therefore a worthwhile question 
to ask what the duals of transversal matroids look like. Since every transversal 
matroid is a gammoid, one can also ask what the duals of gammoids look like. 
This problem has been solved recently by Mason [17] and Ingleton and Piff [13], 
who have shown that the dual of a gammoid is always a gammoid, and hence that 
the dual of a transversal matroid is also a gammoid. Moreover, Ingleton and Piff 
have shown that there are some important gammoids (called ‘strict gammoids’) 
which have certain natural properties, and which turn out to be precisely the duals 
of transversal matroids. 


Ingleton and Piff also showed that a matroid M is a gammoid if and only if M 
is a contraction of a transversal matroid. By imitating the argument of Piff and 
Welsh, they were able to prove that every gammoid is representable over all suf- 
ficiently large fields. 


(5) THE CRITICAL PROBLEM. If M is a matroid on a set E, then the critical problem 
as posed by Crapo and Rota [5] is the problem of determining the minimum number 
k of hyperplanes $H_i$ of M, such that $H_1 \cap \cdots \cap H_k = \emptyset$. 


By duality, this problem is equivalent to the problem of finding the minimum 
number k of circuits of M whose union is E, although the problem loses much of its 
geometrical significance when expressed in this way. Recently the critical problem 
has been solved for quite a number of matroids, although there is some way to go 
before any results of great significance are obtained. The importance of the critical 
problem stems mainly from the fact that several of the famous unsolved problems in 
the coloring of graphs (including the celebrated four-color conjecture) may be shown 
to be special cases of the critical problem. Crapo and Rota have expressed the hope 
that by developing techniques for solving the critical problem in a few simple cases, 
one may eventually find a suitable approach for tackling these coloring problems. 


References 


1. G. Birkhoff, Lattice Theory, 3rd ed. Amer. Math. Soc. Colloq. Publ., 25 (1967). 

2. J. A. Bondy, Transversal matroids, base-orderable matroids, and graphs, Quart. J. Math. 
Oxford 23 (1972) 81-89. 

3. J. A. Bondy and D. J. A. Welsh, Some results on transversal matroids and constructions for 
identically self-dual matroids, Quart. J. Math. Oxford Ser., 22 (1971) 435-451. 

4. R. A. Brualdi and E. B. Scrimger, Exchange systems, matchings and transversals. J. Combi- 
natorial Theory, 5(1968) 244-257. 

5. H. H. Crapo, and G. -C. Rota, Combinatorial Geometries, M.I.T. Press, 1971. 

6. J. de Sousa and D. J. A. Welsh, A characterisation of binary transversal structures, (to appear). 

7. J. Edmonds, Minimum partition of a matroid into independent subsets, J. Res. Nat. Bur. 
Standards, 69 B, (1965) 67-72. 

8. J. Edmonds and D. R. Fulkerson, Transversals and matroid partition, J. Res. Nat. Bur. 
Standards 69 B, (1965) 147-153. 




9. F. Harary and W. T. Tutte, A dual form of Kuratowski’s theorem, Canad. Math. Bull., 8 
(1965) 17-20, 373. 

10. F. Harary and D. J. A. Welsh, Matroids versus graphs, in The Many Facets of Graph Theory, 
Springer Lecture Notes 110, 1969. 

11. A. Horn, A characterization of unions of linearly independent sets, J. London Math. Soc., 
30 (1955) 494-496. 

12. A. W. Ingleton, Representation of matroids, in Combinatorial Mathematics and its Applica- 
tions, Academic Press, New York, 1971. 

13. A. W. Ingleton and M. J. Piff, Gammoids and transversal matroids, (to appear). 

14. K. Kuratowski, Sur le problème des courbes gauches en topologie, Fund. Math., 15 (1930) 
271-283. 

15. S. MacLane, Some interpretations of abstract linear dependence in terms of projective 
geometry, Amer. J. Math., 58 (1936) 236-240. 

16. J. H. Mason, A characterization of transversal independence spaces, in Théorie des Matroïdes, 
Springer Lecture Notes 211, 1971. 

17. J. H. Mason, On a class of matroids arising from paths in graphs, Proc. London Math. Soc., 25 
(1972) 55-74. 

18. L. Mirsky, Transversal Theory, Academic Press, New York, 1970. 

19. L. Mirsky, Transversal theory and the study of abstract independence, J. Math. Anal. Appl., 25 
(1969) 209-217. 

20. L. Mirsky and H. Perfect, Applications of the notion of independence to problems of com- 
binatorial analysis. J. Combinatorial Theory, 2 (1967) 327-357. 

21. C. St. J. A. Nash-Williams, Edge-disjoint spanning trees of finite graphs, J. London Math. 
Soc., 36 (1961) 445-450. 


22. C. St. J. A. Nash-Williams, Decomposition of finite graphs into forests, J. London Math. Soc., 39 (1964) 12. 

23. C. St. J. A. Nash-Williams, An application of matroids to graph theory, Proc. Symp. Rome, Dunod (1966) 
263-265. 


24. T. D. Parsons, On planar graphs, this MONTHLY, 78 (1971) 176-178. 

25. M. J. Piff and D. J. A. Welsh, On the vector representation of matroids, J. London Math. 
Soc., 2 (1970) 284-288. 

26. M. J. Piff and D. J. A. Welsh, On the number of combinatorial geometries, Bull. London Math. Soc., 
3 (1971) 55-56. 

27. R. Rado, A theorem on independence relations, Quart. J. Math. (Oxford) 13 (1942) 83-89. 

28. R. Rado, Axiomatic treatment of rank in infinite sets, Canad. J. Math., 1 (1949) 337-343. 

29. W. T. Tutte, Introduction to the Theory of Matroids, American Elsevier, New York, 1971. 

30. W. T. Tutte, On the problem of decomposing a graph into n connected factors, J. London 
Math. Soc., 36 (1961) 221-230. 

31. W. T. Tutte, Lectures on matroids, J. Res. Nat. Bur. Stand., 69B (1965) 1-47. 

32. B. L. van der Waerden, Moderne Algebra, 2nd ed., Springer, Berlin, 1937. 

33. D. J. A. Welsh, Euler and bipartite matroids, J. Combinatorial Theory, 6 (1969) 375-377. 

34. D. J. A. Welsh, On matroid theorems of Edmonds and Rado, J. London Math. Soc., 2 (1970) 
251-256. 

35. H. Whitney, On the abstract properties of linear dependence, Amer. J. Math., 57 (1935) 
509-533. 

36. R. J. Wilson, Introduction to Graph Theory, Oliver & Boyd (Edinburgh) and Academic 
Press (New York), 1972. 


STOCHASTIC EQUATIONS AND THEIR APPLICATIONS 
G. C. PAPANICOLAOU, Courant Institute, New York University 


1. Introduction. Almost all problems in physics, engineering, economics, biology, 
and other sciences to which mathematical methods are applicable are basically 
stochastic rather than deterministic. It is more appropriate to attempt to determine 
the probabilities with which the phenomena under investigation occur rather than 
the precise phenomenon which occurs. Nevertheless, the majority of mathematical 
methods are based on deterministic models. This is a reasonable first approximation 
which frequently renders the problem mathematically tractable. It is also quite 
adequate for many problems. However, the errors due to stochastic effects may 
accumulate in prolonged observations of a phenomenon so that the deterministic 
analysis becomes useless. It is necessary therefore to have a rational method for 
taking into consideration stochastic effects. 

The problem of accumulation of error in random phenomena has been central 
in the development of probability theory. Until recently, however, investigations 
have been limited to simple situations which exclude several problems of interest. 
Let us consider some examples. 

Suppose we wish to measure a physical quantity such as temperature at a fixed 
time and location. The measuring process is subject to errors which are due to many 
individually negligible causes. Thus by invoking the classical Central Limit Theorem 
[4] we conclude that the error in measurement is a Gaussian random variable with 
zero mean and variance chosen to fit available data or estimated in some other way. 
This is the well-known theory of errors in its simplest form. 

Let us also consider the following problem. Let u(t) represent a physical quantity 
as a function of time and suppose that u(t) satisfies, for example, the differential 
equation 


$\ddot{u}(t) + a(t)\dot{u}(t) + b(t)u(t) = 0, \qquad u(0) = u_0, \quad \dot{u}(0) = \dot{u}_0, \qquad \text{where } \dot{f} = \frac{df}{dt}.$ 


The functions a(t) and b(t) represent properties of the dynamical system determining 
the evolution of u(t). Let us assume that a(t) and b(t) are random functions. Then 
u(t) is a random function defined by a random or stochastic ordinary differential 
equation. If a(t) and b(t) fluctuate little from their expected values which we denote 
by E{a(t)}, E{b(t)}, then, as a first approximation we may solve the deterministic 
problem: 


George Papanicolaou received his Ph. D. at the Courant Institute in 1969 under J. B. Keller. Since 
then he has held positions at the Univ. Heights Campus of N. Y. U. and at the Courant Institute. 
His main research is in the subject of this article. Editor. 




$\ddot{w}(t) + E\{a(t)\}\dot{w}(t) + E\{b(t)\}w(t) = 0, \qquad w(0) = u_0, \quad \dot{w}(0) = \dot{u}_0.$ 


The question arises: how good is this approximation? In particular, how does w(t) 
compare with E{u(t)}? For short times we do not expect much discrepancy. This is 
the justification for considering deterministic problems. But at large times, E{u(t)} 
may deviate very significantly from w(t). A basic problem in the study of stochastic 
equations, and the one we shall consider here, is to find effective methods for 
computing the statistical characteristics of the solution given the statistical charac- 
teristics of the coefficients and of the initial or boundary conditions. 

Let us observe that the simple theory of errors described above is not adequate 
for the study of stochastic equations. The methods of modern probability theory can 
be used, however, to obtain a number of interesting results on the behavior of 
solutions of stochastic equations. We shall present some of these results here. 

In section 2, we give a few examples of physical problems that lead to stochastic 
ordinary differential equations. We limit ourselves mainly to stochastic ordinary 
differential equations because they are the simplest examples. Relatively little 
progress has been achieved for more general problems, such as stochastic partial 
differential equations, despite the fact that many physical problems lead naturally 
to such equations. 

In section 3, we formulate the basic problems associated with stochastic ordinary 
differential equations, in the language of the theory of probability. 

In section 4 we deal with a special class of problems: equations whose coefficients 
are Markov processes. We include here some applications, which are of independent 
interest, to singular perturbation of deterministic partial differential equations. 

The basic limit theorem of Stratonovich [25] and Has’hminskii [9] is presented 
in section 5. This is a far reaching generalization of the classical Central Limit 
Theorem and gives satisfactory results for many problems. This result was introduced 
in [26] and was applied extensively there. It was also obtained in a different manner 
in [14,15, 21]. An operator theoretic proof is given in [22]. 

In section 6 we apply the theorem of section 5 to two examples. 


2. Examples of problems leading to stochastic differential equations. Our first 
example is the random harmonic oscillator. Let u(t) denote the displacement of a 
particle of mass m from its equilibrium position and let a linear spring with spring 
constant k connect the particle with a fixed support. Then u(t) satisfies the equation 
of motion 


(2.1)    $m\ddot{u}(t) + ku(t) = f(t), \qquad u(0) = u_0, \quad \dot{u}(0) = \dot{u}_0.$ 


Here f(t) is an external force acting on the particle and $u_0$, $\dot{u}_0$ are the initial displacement 
and velocity. If f(t) is a random function of time then u(t) will be a random 
function of time. (We delay a precise mathematical statement of this problem and 
the others considered in this section until section 3.) Similarly, $u_0$ and $\dot{u}_0$ may be 
random variables. Under any circumstances the determination of the statistical 
characteristics of u(t) is relatively simple since (2.1) is solved explicitly by 


(2.2)    $u(t) = u_0\cos\sqrt{\frac{k}{m}}\,t + \dot{u}_0\sqrt{\frac{m}{k}}\,\sin\sqrt{\frac{k}{m}}\,t + \int_0^t \frac{f(s)}{\sqrt{mk}}\,\sin\sqrt{\frac{k}{m}}\,(t-s)\,ds.$ 


Thus, for example, E{u(t)}, the expected value of u(t), can be determined by taking 
expected values on both sides of (2.2). Other statistical characteristics can be obtained 
in the same direct manner but computations may become lengthy. 

Let us consider briefly the mean or expected energy associated with the oscillator. 
Let us assume that 


(2.3)    $E\{u_0\} = E\{\dot{u}_0\} = E\{f(t)\} = 0,$ 


that $u_0$, $\dot{u}_0$, f(t) are independent, and that f(t) is a stationary random process with 
covariance 


(2.4)    $E\{f(t)f(s)\} = \sigma^2R(t-s).$ 


We also assume that 


(2.5)    $l = \int_0^\infty R(\sigma)\,d\sigma < \infty.$ 

The quantity $l$ has the dimension of time and is called the correlation time of f(t). 


Now upon multiplying (2.1) by $\dot{u}(t)$, integrating from 0 to t, and taking expected 
values, we obtain 


(2.6)    $E\left\{\frac{m}{2}\dot{u}^2(t) + \frac{k}{2}u^2(t)\right\} - E\left\{\frac{m}{2}\dot{u}_0^2 + \frac{k}{2}u_0^2\right\} = \int_0^t E\{f(s)\dot{u}(s)\}\,ds.$ 


The quantity $\frac{1}{2}m\dot{u}^2 + \frac{1}{2}ku^2$ is the instantaneous energy of the oscillator. To compute 
the change in time of the mean energy from (2.6) we compute the right-hand side 
using our hypotheses (2.3)-(2.5) and (2.2): 


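(2.7)    $\int_0^t E\{f(s)\dot{u}(s)\}\,ds = \frac{\sigma^2}{m}\int_0^t\!\!\int_0^s R(\sigma)\cos\sqrt{\frac{k}{m}}\,\sigma\;d\sigma\,ds = \frac{\sigma^2}{m}\int_0^t (t-\sigma)R(\sigma)\cos\sqrt{\frac{k}{m}}\,\sigma\;d\sigma.$ 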


Thus from (2.5), (2.7) and (2.6) we conclude that when t is large, the mean energy of 
the oscillator increases linearly with t. 
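
The linear growth can be made explicit by splitting the last integral in (2.7). The following is a sketch rather than part of the original argument: it assumes, beyond (2.5), that $R(\tau)$ and $\tau R(\tau)$ are absolutely integrable on $[0,\infty)$, and it writes $\tau$ for the integration variable to keep it apart from the variance $\sigma^2$. The exponential covariance in the last line is an illustrative choice, picked so that its integral equals $l$ as in (2.5).

```latex
\begin{align*}
\frac{\sigma^2}{m}\int_0^t (t-\tau)\,R(\tau)\cos\Bigl(\sqrt{\tfrac{k}{m}}\,\tau\Bigr)d\tau
  &= \frac{\sigma^2 t}{m}\int_0^t R(\tau)\cos\Bigl(\sqrt{\tfrac{k}{m}}\,\tau\Bigr)d\tau
   - \frac{\sigma^2}{m}\int_0^t \tau\,R(\tau)\cos\Bigl(\sqrt{\tfrac{k}{m}}\,\tau\Bigr)d\tau \\
  &= \frac{\sigma^2 t}{m}\int_0^\infty R(\tau)\cos\Bigl(\sqrt{\tfrac{k}{m}}\,\tau\Bigr)d\tau + O(1),
   \qquad t \to \infty, \\[4pt]
R(\tau) = e^{-\tau/l}
  \;&\Longrightarrow\;
  \int_0^\infty e^{-\tau/l}\cos\Bigl(\sqrt{\tfrac{k}{m}}\,\tau\Bigr)d\tau = \frac{l}{1 + k l^2/m},
  \qquad \text{growth rate } \frac{\sigma^2 l}{m + k l^2}.
\end{align*}
```

In this example the growth rate is largest when $l$ is comparable to $\sqrt{m/k}$ and tends to zero both for very short and for very long correlation times.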




This conclusion should be contrasted with the deterministic case when f(t) is 
periodic. Here the energy is bounded if the period of f is different from $2\pi\sqrt{m/k}$ 
and increases like $t^2$ when f is periodic with this period (resonance). 

Problem (2.1) is uncharacteristically simple because its solution is known explicitly. 
A more interesting problem arises when either the ‘‘constants’’ of the oscillator m 
and k are random functions, or when the spring is nonlinear, or when both situations 
arise. Let us suppose that the spring ‘‘constant’’ k(t) is a random function so that 
we have 


(2.8)    $m\ddot{u}(t) + k(t)u(t) = 0, \qquad u(0) = u_0, \quad \dot{u}(0) = \dot{u}_0.$ 


If we replace k(t) by its expected value in (2.8), this amounts to taking expected 
values and then assuming that 


(2.9)    $E\{k(t)u(t)\} = E\{k(t)\}\,E\{u(t)\}.$ 
But we have no way of knowing whether (2.9) is valid or not since u(t) is itself a 
functional of $k(\sigma)$ for $\sigma \in [0,t]$. In fact (2.9) is false in general. 
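
A degenerate special case already shows how (2.9) can fail. Assume, purely for illustration, that m = 1, u(0) = 1, the initial velocity is zero, and that the spring constant does not vary in time but is a random variable taking the two values $k_0(1 \pm \delta)$ with probability 1/2 each. Then (2.8) is solved by a cosine for each outcome, and the averages can be written down exactly. The Python sketch below (all names and the numerical values of `K0` and `DELTA` are assumptions of this example) compares the mean solution with the solution w(t) of the averaged equation and evaluates the difference between the two sides of (2.9).

```python
import math

K0 = 1.0       # mean spring constant (illustrative value)
DELTA = 0.1    # relative size of the fluctuation (illustrative value)
K_VALUES = (K0 * (1.0 - DELTA), K0 * (1.0 + DELTA))   # each taken with probability 1/2


def u(t, k):
    """Solution of u'' + k u = 0 with u(0) = 1, u'(0) = 0 (mass m = 1)."""
    return math.cos(math.sqrt(k) * t)


def mean_u(t):
    """E{u(t)}: average of the two equally likely solutions."""
    return sum(u(t, k) for k in K_VALUES) / 2.0


def w(t):
    """Solution of the averaged equation, with k replaced by its mean K0."""
    return math.cos(math.sqrt(K0) * t)


def mean_ku(t):
    """E{k u(t)} computed exactly from the two outcomes."""
    return sum(k * u(t, k) for k in K_VALUES) / 2.0


def factorization_gap(t):
    """E{k u(t)} - E{k} E{u(t)}; it would vanish identically if (2.9) held."""
    return mean_ku(t) - K0 * mean_u(t)
```

Evaluating `mean_u(t) - w(t)` at a short time such as t = 0.1 and at a long time such as t = 30 shows the behavior described in the introduction: the two agree closely at first and drift apart later, while `factorization_gap(t)` is not zero at generic times.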

Let us suppose that k(t) is a stationary process so that 


(2.10)    $E\{k(t)\} = k_0 > 0, \qquad E\{(k(t)-k_0)(k(s)-k_0)\} = \sigma^2R(t-s), \qquad l = \int_0^\infty R(\sigma)\,d\sigma < \infty.$ 


The correlation time $l$ is representative of the time length over which the values of 
k(t) are significantly correlated. There is another natural time scale associated with 
(2.8) namely the period 


(2.11)    $\nu = 2\pi\sqrt{\frac{m}{k_0}}$ 


of oscillation when k(t) is replaced by $k_0$. We now distinguish the following three 
cases for special consideration, when the variance of the inhomogeneities $\sigma^2$ is small. 

(i) $\nu \ll l$. In this case replacing k(t) by its mean value $k_0$ will not have appreciable 
effect over many periods of oscillation. The motion will be approximated well by the 
deterministic equation. 

(ii) $\nu \gg l$. In this case we expect that stochastic effects are quite important at 
times of the order of magnitude of several periods. In section 5, we shall see that in 
this case a generalized Central Limit Theorem is valid. 

(iii) $\nu$ comparable to $l$. In this case the precise nature of the process k(t) will be 
important. If we assume that k(t) is Markovian then sometimes it may be possible 
to obtain information about u(t). Equations with Markovian coefficients are discussed 
in section 4. 

Let us now suppose that the oscillator is nonlinear and is acted upon by a random 
force so that 


(2.12)    $m\ddot{u} + V'(u(t)) = f, \qquad u(0) = u_0, \quad \dot{u}(0) = \dot{u}_0, \qquad V' = \frac{dV}{du}.$ 


Here V(u) is the potential energy function of the spring and is assumed to be convex 
upwards in the neighborhood of u = 0. Let us also assume that f(t) is stationary 
with mean zero and (2.4), (2.5) hold. We may again consider the three regimes 
encountered above. Now we may take for $\nu$ the quantity $2\pi\sqrt{m/V''(0)}$ provided V 
is a smooth function near u = 0. 

Another example of interest is the following. Let u(t, x) denote the transverse 
displacement of a taut string occupying the interval [0,L]. Then u satisfies the 
equation: 


(2.13)    $\rho u_{tt} = Fu_{xx}, \quad u(t,0) = \cos\omega t, \quad u(t,L) = 0, \quad u(0,x) = f(x), \quad u_t(0,x) = g(x).$ 


We have assumed that the left end point of the string is being subjected to oscillations 
of frequency $\omega$ and we have denoted by $\rho = \rho(x) > 0$ the mass density of the string 
and by F the tensile force. Let us suppose that $\rho(x)$ is a random function of position 
with mean $\rho_0 > 0$ and sufficiently small fluctuation. If we ignore the transient behavior 
of the string and examine only the steady state solution 


$u(t,x) = \text{Real part of } [e^{i\omega t}v(x)],$ 


then v(x) satisfies the boundary value problem 

(2.14)    $v_{xx} + \frac{\omega^2}{F}\rho(x)v = 0, \quad v(0) = 1, \quad v(L) = 0.$ 


Thus v(x) is a random function of position. Problem (2.14) is more difficult than 
(2.8) because it is a boundary value problem. We may again consider the three cases 
we introduced following (2.11). But for case (ii) and case (iii) no results analogous 
to the ones for initial value problems are known. 

Stochastic eigenvalue problems also arise frequently. For example, let u(t, x) be 
again as above but now suppose both end points are kept fixed. Then we have 


(2.15)    $\rho u_{tt} = Fu_{xx}, \quad u(t,0) = u(t,L) = 0, \quad u(0,x) = f(x), \quad u_t(0,x) = g(x).$ 


If we look for time harmonic solutions $e^{i\omega t}v(x)$ then we are led to the eigenvalue 
problem 


(2.16)    $v_{xx} = -\lambda\rho(x)v, \quad v(0) = v(L) = 0, \quad \lambda = \frac{\omega^2}{F}.$ 


Since $\rho(x) > 0$ is a random function, $\lambda$ is a random variable, and the corresponding 
solution v(x) of (2.16) is a random function. Problems of this kind are also difficult 
to treat. Some results have been obtained in [1]. We shall not consider this 
problem here. 

Finally we consider wave propagation in a one dimensional random medium. 
Let u(t, x) represent the wave field at location x at time t. We shall assume that the 
index of refraction n(x) is identically 1 outside the interval [0,L] and equal to a 
random function inside [0,L]. Then u(t, x) satisfies the problem 


(2.17)    $u_{tt} - \left(\frac{c}{n(x)}\right)^2 u_{xx} = 0,$ 
          $n(x) = 1, \quad x < 0, \; x > L; \qquad n(x) = 1 + \mu(x), \quad 0 \le x \le L,$ 
          $u(0,x) = f(x), \quad u_t(0,x) = g(x),$ 
          $u(t,x), \; u_x(t,x)$ continuous at $x = 0$ and $x = L$. 


We assume that the random function $\mu(x)$, which has mean zero, is bounded in 
absolute value by a constant less than one, so that the above problem is well posed. 
Let us restrict our considerations to time harmonic fields of frequency w by assuming 
that 


(2.18)    $u(t,x) = e^{i(kx-\omega t)} + Re^{-i(kx+\omega t)}, \quad x < 0, \qquad k = \omega/c,$ 
(2.19)    $u(t,x) = Te^{i(kx-\omega t)}, \quad x > L.$ 


Here T and R are complex random variables and are called the transmission and 
reflection coefficients respectively. In view of (2.18), (2.19), problem (2.17) reduces to 
the following problem for $v(x) = e^{i\omega t}u(t,x)$: 


(2.20)    $v_{xx} + k^2n^2(x)v = 0, \quad 0 \le x \le L,$ 
          $v(x) = e^{ikx} + Re^{-ikx}, \quad x < 0,$ 
          $v(x) = Te^{ikx}, \quad x > L,$ 
          $v, \; v_x$ continuous at $x = 0$ and $x = L$. 


The physical significance of (2.18) and (2.19) is that, under steady state conditions, 
a plane wave $e^{i(kx-\omega t)}$ is incident upon the interval of inhomogeneities [0, L] which 
produces a reflected wave $Re^{-i(kx+\omega t)}$, $x < 0$, and a transmitted wave $Te^{i(kx-\omega t)}$, 
$x > L$. The problem we consider is to determine the statistical characteristics of T and 
R given the statistical characteristics of n(x) in $0 \le x \le L$. In section 6 we shall give 
some results on this problem when case (ii) (see above) prevails. Note that here the 
space variable x plays the role of time, $\nu = 2\pi/k$ is the wave length of the incident 
wave, and $l$ is the correlation length of the inhomogeneities $\mu(x)$ ($E\{\mu(x)\} = 0$, 
$E\{\mu(x)\mu(x')\} = \sigma^2R(x-x')$). 
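
For a single realization of the medium, T and R in (2.20) can be computed numerically. The Python sketch below is an illustration under an assumption that is not made in the text: the fluctuation $\mu(x)$ is taken to be piecewise constant on layers of equal width, so that the equation can be solved exactly within each layer and the continuity conditions at the interfaces become 2-by-2 matrix steps. The function name `scatter`, the layer model, and the numerical values are all hypothetical.

```python
import cmath
import math
import random


def scatter(mu, k, length):
    """Return (R, T) for layers of equal width with index n_j = 1 + mu[j] on [0, length].

    The field is started to the right of x = length as a pure transmitted wave,
    normalised so that v(length) = 1, and carried back to x = 0 layer by layer.
    `mu` must be a non-empty sequence of real numbers greater than -1.
    """
    width = length / len(mu)
    v, dv = 1.0 + 0.0j, 1j * k                 # v and its x-derivative at x = length
    for mu_j in reversed(mu):
        q = k * (1.0 + mu_j)                   # local wave number inside the layer
        c, s = math.cos(q * width), math.sin(q * width)
        # exact solution of v'' + q^2 v = 0 carried one layer width to the left
        v, dv = c * v - (s / q) * dv, q * s * v + c * dv
    incident = (v + dv / (1j * k)) / 2.0       # coefficient of e^{ikx} for x < 0
    reflected = (v - dv / (1j * k)) / 2.0      # coefficient of e^{-ikx} for x < 0
    R = reflected / incident
    T = cmath.exp(-1j * k * length) / incident
    return R, T


K = 2.0 * math.pi          # wave number, so the wave length 2*pi/K equals 1
LENGTH = 20.0              # width of the slab of inhomogeneities
RNG = random.Random(0)     # fixed seed so the realization is reproducible
MU = [RNG.uniform(-0.3, 0.3) for _ in range(200)]   # bounded, mean-zero fluctuations

R_RANDOM, T_RANDOM = scatter(MU, K, LENGTH)
R_FLAT, T_FLAT = scatter([0.0] * 10, K, LENGTH)     # homogeneous medium, n = 1
```

Because n(x) is real there is no absorption, so the squared moduli of R and T should add up to one for every realization; this gives a convenient check on the computation. Statistics of T and R could then be estimated by repeating the call over many independently drawn sequences `MU`.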


3. Formulation of the mathematical problem. The description of the problems we 
have given in section 2 and in the introduction has been somewhat loose. In this 
section we wish to state these problems in correct mathematical language. We shall 
do this for a general class of stochastic ordinary differential equations but we shall 
restrict ourselves to initial value problems. The other problems can be formulated 
similarly. 

Let $(\Omega, \mathcal{F}, P)$ be a probability space, that is, $\Omega$ is an abstract set, $\mathcal{F}$ a $\sigma$-algebra 
of subsets of $\Omega$ and P a probability measure on $\mathcal{F}$. Let $x(t,\omega)$, $\omega \in \Omega$, be a measurable 
mapping of $\Omega \times [0,\infty)$ to $R^m$ (on $\Omega \times [0,\infty)$ we take the product measure $P \times \mu$, 
$\mu$ = Lebesgue measure). Let F(z,x,t) be a mapping of $R^n \times R^m \times [0,\infty)$ into $R^n$ 
such that it is differentiable in z and Lebesgue measurable in t and x. The differential 
equation 


(3.1)    $\frac{dz(t,\omega)}{dt} = F(z(t,\omega), x(t,\omega), t), \qquad z(0,\omega) = z_0$ 


has a unique solution $z(t,\omega)$ in the sense of Carathéodory [3]. The solution is a 
measurable function of $\omega$. The measurability with respect to $\omega$ follows from considerations 
analogous to those yielding continuous dependence on parameters. 

If the paths $x(t,\omega)$ are almost surely continuous and F(z, x, t) is continuous as a 
function of x and t, then the paths $z(t,\omega)$ will be almost surely differentiable and will 
satisfy (3.1). The initial condition $z(0,\omega) = z_0$ may be a random variable without 
additional complication. Similarly, if $x(t,\omega)$ is a pure jump process with no instantaneous 
states and F(z,x,t) is continuous as a function of x and t, then $z(t,\omega)$ 
will have continuous paths which are differentiable between jumps of $x(t,\omega)$. 
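
The pure jump case can be illustrated with a scalar example in which everything is explicit. Assume, for this sketch only, that F(z, x) = -xz and that one path of the jump process is given by a list of jump times and the constant values held between them; the solution is then $z_0$ times the exponential of minus the integral of the path. The names and the sample path below are hypothetical.

```python
import math


def integrate_path(jump_times, states, t):
    """Integral over [0, t] of a path equal to states[i] on [jump_times[i], jump_times[i+1])."""
    total = 0.0
    for i, x in enumerate(states):
        start = jump_times[i]
        end = jump_times[i + 1] if i + 1 < len(jump_times) else math.inf
        if t <= start:
            break
        total += x * (min(t, end) - start)
    return total


def z(t, z0, jump_times, states):
    """Solution of dz/dt = -x(t) z, z(0) = z0, along one piecewise-constant path x(t)."""
    return z0 * math.exp(-integrate_path(jump_times, states, t))


JUMP_TIMES = [0.0, 1.0, 2.5]   # the path jumps at t = 1.0 and t = 2.5
STATES = [1.0, 3.0, 0.5]       # values held on the successive intervals
```

Along this path z is continuous at t = 1.0, but its one-sided slopes there are -1.0 z and -3.0 z respectively, so the path of z has a corner exactly where x jumps.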

Thus no question of "existence and uniqueness" arises for the stochastic
differential equation (3.1) which requires considerations different from the deterministic
situation. However, such questions do arise when the process $x(t,\omega)$ has unusually
rough paths, as for example when $x(t,\omega)$ is white noise, i.e., the "derivative" of the
Brownian motion process. Since the Brownian paths are almost surely
non-differentiable, an equation such as (3.1) must then be interpreted in an appropriate
manner. One useful interpretation leads to the subject of Itô stochastic equations
[16]. Under this interpretation the ensuing process $z(t,\omega)$ is a diffusion Markov
process and the representation (3.1) can be exploited in the investigation of its
properties [16].

From the point of view of applications, however, this interpretation is frequently 
inappropriate. The simplest reason why this is so is that the process is not associated 
with (3.1) in an invariant way. A change of coordinates in (3.1) leads to a different 
diffusion process (not the same process in the new coordinate representation). An 
interpretation which overcomes this difficulty, but which is not as convenient
mathematically as Itô's, has been proposed by Stratonovich [25]. This point is also
discussed in [27]. 

We shall always assume here that $x(t,\omega)$ has well-behaved paths so that no
problem of interpretation arises. The main objective is to find effective methods for
obtaining the statistical properties of $z(t,\omega)$, such as $E\{z(t,\omega)\}$, etc., given those of
$x(t,\omega)$. As is customary, we shall not write the $\omega \in \Omega$ explicitly in the sequel.


4. Equations with Markov coefficients. Let us consider the stochastic differential
equation

$$\frac{dz(t)}{dt} = F(z(t), x(t)), \qquad z(0) = z. \tag{4.1}$$

Here $F(z,x)$ is a mapping of $R^n \times R^m$ into $R^n$, differentiable in z and continuous in x,
and x(t) is an $R^m$ valued Markov process. We first consider the case where x(t) is a
time homogeneous diffusion process whose infinitesimal mean and variance are
given by

$$a_{ij}(x) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, E\{[x_i(t+\Delta t) - x_i(t)]\,[x_j(t+\Delta t) - x_j(t)] \mid x(t) = x\}, \tag{4.2}$$

$$b_i(x) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, E\{x_i(t+\Delta t) - x_i(t) \mid x(t) = x\}, \qquad i,j = 1,\ldots,m. \tag{4.3}$$


From the above assumptions it follows that (z(t), x(t)) constitutes an $R^{n+m}$
valued diffusion Markov process. To compute its infinitesimal mean and variance
we use (4.1) and obtain, in addition to (4.2) and (4.3),

$$\lim_{\Delta t \to 0} \frac{1}{\Delta t}\, E\{z_i(t+\Delta t) - z_i(t) \mid z(t), x(t)\} = F_i(z(t), x(t)), \qquad i = 1,2,\ldots,n,$$

$$\lim_{\Delta t \to 0} \frac{1}{\Delta t}\, E\{[z_i(t+\Delta t) - z_i(t)]\,[x_j(t+\Delta t) - x_j(t)] \mid z(t), x(t)\} = 0, \qquad i = 1,2,\ldots,n, \quad j = 1,2,\ldots,m,$$

$$\lim_{\Delta t \to 0} \frac{1}{\Delta t}\, E\{[z_i(t+\Delta t) - z_i(t)]\,[z_j(t+\Delta t) - z_j(t)] \mid z(t), x(t)\} = 0, \qquad i = 1,\ldots,n, \quad j = 1,\ldots,n.$$

Let f(z,x) be a bounded smooth function on $R^{n+m}$ and define u(t,z,x) by

$$u(t,z,x) = E_{z,x}\{f(z(t), x(t))\}.$$


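Here $E_{z,x}\{\ \}$ denotes expectation with respect to the measure on the paths of the
process (z(t), x(t)) starting from $(z,x) \in R^{n+m}$. Then u(t,z,x) satisfies the backward
Kolmogorov equation [7]

$$\frac{\partial u}{\partial t} = \frac{1}{2}\sum_{i,j=1}^{m} a_{ij}(x)\frac{\partial^2 u}{\partial x_i\,\partial x_j} + \sum_{j=1}^{m} b_j(x)\frac{\partial u}{\partial x_j} + \sum_{j=1}^{n} F_j(z,x)\frac{\partial u}{\partial z_j}, \qquad u(0,z,x) = f(z,x). \tag{4.4}$$

Similarly, it can be shown that the transition probability density of (z(t), x(t)),
$P(t,z,x;z_0,x_0)$, when it exists, satisfies the forward equation [7]

$$\frac{\partial P}{\partial t} = \frac{1}{2}\sum_{i,j=1}^{m}\frac{\partial^2 (a_{ij}P)}{\partial x_i\,\partial x_j} - \sum_{j=1}^{m}\frac{\partial (b_jP)}{\partial x_j} - \sum_{j=1}^{n}\frac{\partial (F_jP)}{\partial z_j}, \tag{4.5}$$

$$P(0,z,x;z_0,x_0) = \delta(x - x_0)\,\delta(z - z_0).$$

The coefficients $a_{ij}$, $b_j$ and $F_j$ are assumed to be sufficiently smooth.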

The problem of determining the statistical characteristics of z(t) defined by (4.1) 
is equivalent to solving (4.4) or (4.5) when x(t) is a given diffusion process. For most 
problems of interest this is an untenable objective and approximations must be 
sought. We shall do this systematically in section 5 in a somewhat broader context. 
We shall explore here briefly the connections between stochastic processes and 
differential equations, a subject of independent interest. We shall also consider a 
simple example, for which (4.4) is not solvable explicitly but a time independent 
solution can be obtained. 

Let x(t) be a one dimensional diffusion process with diffusion coefficient a(x)
and drift coefficient b(x). Let z(t) be the scalar valued process defined by

$$\frac{dz(t)}{dt} = v(x(t))\,z(t), \qquad z(0) = z. \tag{4.6}$$

Here v(x) is a bounded smooth function on R. Clearly the solution of (4.6) is

$$z(t) = z\exp\left\{\int_0^t v(x(s))\,ds\right\}. \tag{4.7}$$

From the above considerations it follows that

$$u(t,z,x) = E_{z,x}\{f(z(t), x(t))\} \tag{4.8}$$

satisfies the problem

$$\frac{\partial u}{\partial t} = \frac{1}{2}a(x)\frac{\partial^2 u}{\partial x^2} + b(x)\frac{\partial u}{\partial x} + v(x)\,z\,\frac{\partial u}{\partial z}, \qquad u(0,z,x) = f(z,x). \tag{4.9}$$

Consider the special initial function

$$f(z,x) = z\,g(x),$$

with g(x) smooth. Assuming all integrals exist we define V(x,t) by

$$V(x,t) = E_x\left\{\exp\left[\int_0^t v(x(s))\,ds\right] g(x(t))\right\}. \tag{4.10}$$

But from the definition of u it follows that

$$u(t,z,x) = z\,V(x,t). \tag{4.11}$$

Thus (4.11) and (4.9) yield

$$\frac{\partial V}{\partial t} = \frac{1}{2}a(x)\frac{\partial^2 V}{\partial x^2} + b(x)\frac{\partial V}{\partial x} + v(x)V, \qquad V(x,0) = g(x). \tag{4.12}$$


The representation (4.10) of the solution to (4.12) is called the Feynman-Kac 
formula [9]. The derivation we gave here is similar to that of Frisch [5]. Some 
examples of how this representation is exploited can be found in [6,12]. 
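
The passage from (4.9) to (4.12) is a one-line substitution. The following sketch spells it out, assuming (4.9) reads $\partial u/\partial t = \tfrac12 a\,u_{xx} + b\,u_x + v\,z\,u_z$; the subscript notation for partial derivatives is introduced here only for brevity and is not the author's.

```latex
% By (4.7), z(t) = z exp{ \int_0^t v(x(s)) ds }, so for f(z,x) = z g(x)
% the expectation (4.8) factors as u(t,z,x) = z V(x,t), which is (4.11).
\begin{align*}
u_t &= z\,V_t, \qquad u_x = z\,V_x, \qquad u_{xx} = z\,V_{xx}, \qquad z\,u_z = z\,V, \\
z\,V_t &= z\left[\tfrac12\,a(x)\,V_{xx} + b(x)\,V_x + v(x)\,V\right].
\end{align*}
% Dividing by z (for z \neq 0) gives (4.12).
% The initial condition u(0,z,x) = z g(x) gives V(x,0) = g(x).
```

The potential term $v(x)V$ in (4.12) thus comes entirely from the $z\,\partial u/\partial z$ term of (4.9).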

Next we consider the case where x(t) in (4.1) is not a diffusion process but a pure
jump Markov process. For simplicity we take x(t) to be a Markov chain with
infinitesimal matrix $Q = (q_{ij})$,

$$Q = \left.\frac{d}{dt}\,e^{Qt}\right|_{t=0}.$$

Here $e^{Qt}$ is the transition probability matrix of the chain. We assume that the chain
has no instantaneous states,

$$-q_{ii} < \infty, \qquad i = 1,2,\ldots,$$

and is conservative,

$$-q_{ii} = \sum_{j \ne i} q_{ij}.$$

Again (z(t), x(t)) are jointly a Markov process with state space $R^n \times \{1,2,\ldots\}$. Since
x(t) is discrete valued we shall denote functional dependence on x(t) by a subscript.
Thus, as above, the functions

$$u_i(t,z) = E_{z,i}\{f_{x(t)}(z(t))\}, \tag{4.13}$$

with $f_i(z)$, $i = 1,2,\ldots$, smooth functions on $R^n$, satisfy the system of partial differential
equations

$$\frac{\partial u_i}{\partial t} = \sum_{j} q_{ij}\,u_j + \sum_{k=1}^{n} F_k(z,x_i)\frac{\partial u_i}{\partial z_k}, \qquad u_i(0,z) = f_i(z). \tag{4.14}$$

Let us consider the special case where x(t) is the random telegraph process, i.e.,
the two state chain with values $+1$ and $-1$ and

$$Q = \begin{pmatrix} -a & a \\ a & -a \end{pmatrix}, \qquad a > 0. \tag{4.15}$$


Let us further define F to be scalar valued, so that z(t) is scalar valued, and set

$$F(z,\pm 1) = \pm 1. \tag{4.16}$$

Then from (4.15) and (4.16) it follows that for this example (4.14) becomes

$$\begin{aligned}
\frac{\partial u_+}{\partial t} &= -a\,u_+ + a\,u_- + \frac{\partial u_+}{\partial z}, \\
\frac{\partial u_-}{\partial t} &= a\,u_+ - a\,u_- - \frac{\partial u_-}{\partial z}, \qquad u_\pm(0,z) = f_\pm(z).
\end{aligned} \tag{4.17}$$

In (4.17) we have denoted by $\pm$ the two states of the chain instead of numerical
subscripts. Now from (4.16) and (4.1) it follows that

$$z(t) = z + \int_0^t x(s)\,ds,$$

and from (4.13)

$$u_\pm(t,z) = E_\pm\left\{f_{x(t)}\left(z + \int_0^t x(s)\,ds\right)\right\}. \tag{4.18}$$


This result is a probabilistic representation of the solution of the hyperbolic 
system (4.17) in the form of an expectation over paths of the random telegraph 
process. It was first obtained by Kac [11]. However, the connection of this representa- 
tion to the Feynman-Kac formula (4.10) and (4.12) was not noticed there. The 
probabilistic representation of the solution of hyperbolic systems is given in [8, 23]. 
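
The representation (4.18) can be tried out numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, the chain flips sign at rate a as in (4.15), and the comparison uses the special choice $f_+(z) = f_-(z) = z$, for which $E_\pm\{x(s)\} = \pm e^{-2as}$ gives the closed form $u_\pm(t,z) = z \pm (1 - e^{-2at})/(2a)$. One checks directly that this closed form satisfies (4.17).

```python
import math
import random


def telegraph_integral(t, a, start, rng):
    """Return (integral of x(s) over [0, t], x(t)) for one telegraph path.

    The path starts at `start` (+1 or -1) and flips sign at rate `a`.
    """
    state, clock, total = start, 0.0, 0.0
    while True:
        hold = rng.expovariate(a)          # exponential holding time, mean 1/a
        if clock + hold >= t:
            total += state * (t - clock)   # last, partial holding interval
            return total, state
        total += state * hold
        clock += hold
        state = -state                     # jump to the other state


def u_monte_carlo(t, z, a, start, f_plus, f_minus, n_paths=20000, seed=0):
    """Estimate u_start(t, z) = E{ f_{x(t)}(z + int_0^t x ds) } by sampling paths."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_paths):
        integral, final = telegraph_integral(t, a, start, rng)
        f = f_plus if final > 0 else f_minus
        acc += f(z + integral)
    return acc / n_paths


def u_exact_linear(t, z, a, start):
    """Closed form for f_+(z) = f_-(z) = z (an illustrative special case)."""
    return z + start * (1.0 - math.exp(-2.0 * a * t)) / (2.0 * a)
```

For example, `u_monte_carlo(1.0, 0.5, 1.0, +1, lambda z: z, lambda z: z)` should agree with `u_exact_linear(1.0, 0.5, 1.0, +1)` up to sampling error of order $1/\sqrt{n}$ in the number of paths.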

As an example of the usefulness of the representation (4.18) we shall employ it to
obtain information about properties of the solutions of (4.17). We do this in somewhat
greater generality. Let x(t) be an N-state ergodic Markov chain with values $x_1$,
$x_2, \ldots, x_N$, and with a unique invariant probability vector $(P_1, P_2, \ldots, P_N)$ so that

$$\sum_{i=1}^{N} P_i\,x_i = 0. \tag{4.19}$$

We assume that z(t) is scalar valued and

$$F(z,x_i) = x_i, \qquad i = 1,\ldots,N. \tag{4.20}$$

Then the functions $u_i(t,z) = E_i\{f_{x(t)}(z + \int_0^t x(s)\,ds)\}$ satisfy the differential equations

$$\frac{\partial u_i}{\partial t} = \sum_{j=1}^{N} q_{ij}\,u_j + x_i\frac{\partial u_i}{\partial z}, \qquad u_i(0,z) = f_i(z). \tag{4.21}$$


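We wish to study the behavior of $u_i(t,z)$ when t is large. For this purpose, we
introduce a small parameter $\varepsilon$ which we allow to tend to zero as $t \to \infty$. In this way
we obtain a nontrivial limit. We set

$$u_i^{(\varepsilon)}(\tau,z) = E_i\left\{f_{x(\tau/\varepsilon^2)}\left(z + \varepsilon\int_0^{\tau/\varepsilon^2} x(s)\,ds\right)\right\}, \qquad \tau = \varepsilon^2 t. \tag{4.22}$$

Thus $u_i^{(\varepsilon)}$ satisfies the system

$$\frac{\partial u_i^{(\varepsilon)}}{\partial \tau} = \frac{1}{\varepsilon^2}\sum_{j=1}^{N} q_{ij}\,u_j^{(\varepsilon)} + \frac{1}{\varepsilon}\,x_i\frac{\partial u_i^{(\varepsilon)}}{\partial z}, \qquad u_i^{(\varepsilon)}(0,z) = f_i(z). \tag{4.23}$$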
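We note that when we let $\varepsilon \to 0$ formally in (4.23), $0 < \tau \le \tau_0$ fixed, it reduces to an
algebraic problem and thus it is a "singular" perturbation. It is not difficult however
to find the limit of $u_i^{(\varepsilon)}(\tau,z)$ by using (4.22). Note that

$$u_i^{(\varepsilon)}(\tau,z) = E_i\left\{f_{x(\tau/\varepsilon^2)}\left(z + \sqrt{\tau}\,\frac{1}{\sqrt{\tau/\varepsilon^2}}\int_0^{\tau/\varepsilon^2} x(s)\,ds\right)\right\}.$$

Now $(1/\sqrt{t})\int_0^t x(s)\,ds$ converges weakly as $t \to \infty$ to a Gaussian random variable
with mean zero and variance

$$\sigma^2 = \lim_{t \to \infty}\frac{1}{t}\int_0^t\!\!\int_0^t E\{x(s)x(s')\}\,ds\,ds'.$$

This is the Central Limit Theorem for Markov Chains [17]. Moreover

$$\lim_{\substack{\varepsilon \to 0 \\ 0 < \tau \le \tau_0}} P\{x(\tau/\varepsilon^2) = x_i\} = P_i$$

independently of the initial conditions on x(t). This follows from the ergodic theorem
for Markov chains [17]. These two observations imply that when $f_i(z)$, $i = 1,\ldots,N$,
are bounded continuous functions, then

$$\lim_{\substack{\varepsilon \to 0 \\ 0 < \tau \le \tau_0}} u_i^{(\varepsilon)}(\tau,z) = \int_{-\infty}^{\infty}\frac{e^{-\zeta^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}}\sum_{j=1}^{N} P_j\,f_j(z + \sqrt{\tau}\,\zeta)\,d\zeta.$$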

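We summarize the above as follows. Under the hypothesis stated, the solution
of (4.23) converges as $\varepsilon \to 0$, $0 < \tau \le \tau_0$ fixed, uniformly in z to $u^{(0)}(\tau,z)$ which is
independent of i and satisfies the problem

$$\frac{\partial u^{(0)}}{\partial \tau} = \frac{1}{2}\sigma^2\frac{\partial^2 u^{(0)}}{\partial z^2}, \qquad u^{(0)}(0,z) = \sum_{i=1}^{N} P_i\,f_i(z).$$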


The use of differential equations methods to prove limit theorems goes back to 
Khinchine [13]. It was used by Pinsky [23] to investigate (4.23) and obtain limit 
theorems for Markov chains with error estimates. Here we have used the limit 
theorems to obtain a result on the singular perturbation of (4.21) (in the form (4.23)) 
as was done in [8]. 

The relation between limit theorems for Markov processes and singular pertur- 
bation of differential equations has received considerable attention recently. The 
article of Pinsky [24] gives a good survey of the subject. 
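
The diffusive scaling behind this limit can be made concrete for the random telegraph process. The sketch below is illustrative only. It assumes a stationary start (states $\pm 1$ with probability 1/2 each, so that (4.19) holds) and the correlation $E\{x(s)x(s')\} = e^{-2a|s-s'|}$; the function names are hypothetical. Integrating the correlation over $[0,t]^2$ gives the variance in closed form, and the scaled variance of $\varepsilon\int_0^{\tau/\varepsilon^2} x\,ds$ tends to $\sigma^2\tau$ with $\sigma^2 = 1/a$.

```python
import math


def integral_variance(t, a):
    """Variance of int_0^t x(s) ds for the stationary telegraph process.

    Obtained by integrating E{x(s) x(s')} = exp(-2 a |s - s'|) over [0, t]^2.
    """
    return t / a - (1.0 - math.exp(-2.0 * a * t)) / (2.0 * a * a)


def scaled_variance(tau, eps, a):
    """Variance of eps * int_0^{tau/eps^2} x(s) ds, the random shift in (4.22)."""
    return eps * eps * integral_variance(tau / (eps * eps), a)


def limiting_sigma2(a):
    """sigma^2 = lim_{t -> infinity} (1/t) * integral_variance(t, a) = 1/a."""
    return 1.0 / a
```

For fixed `tau`, `scaled_variance(tau, eps, a)` differs from `tau * limiting_sigma2(a)` by at most `eps**2 / (2 * a**2)`, which is the sense in which the hyperbolic system (4.23) approaches the diffusion equation as $\varepsilon \to 0$.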

We close this section with an example where a time independent solution of (4.4)
can be obtained. More complicated examples and some remarks on the existence of
time independent (or equilibrium) solutions for diffusion equations can be found in
[26].


Consider the nonlinear oscillator

$$\frac{d^2 z_1}{dt^2} + \frac{dz_1}{dt} + k^2 z_1 + a^2 z_1^3 = N(t), \qquad z_1(0) = z_1, \quad \frac{dz_1(0)}{dt} = z_2. \tag{4.24}$$

Here a and k are real constants and N(t) is the white noise process, that is, the formal
derivative of the Brownian motion process. No question of interpretation for (4.24)
arises (see Sec. 3) because we may simply integrate (4.24) once and consider the
resulting equation as defining $z_1(t)$. Let

$$z_2(t) = \frac{dz_1(t)}{dt}.$$


Because of the properties of Brownian motion (it has independent increments) the 
process (z,(t),z,(t)) is a diffusion Markov process with backward Kolmogorov 
equation [7] 


(4.25) 
$u(0, z_1, z_2) = f(z_1, z_2)$.


Let us define the function $H(z_1,z_2)$ by

$$H(z_1,z_2) = \frac{1}{2}k^2 z_1^2 + \frac{1}{4}a^2 z_1^4 + \frac{1}{2}z_2^2.$$

This is the total energy, or Hamiltonian, of the oscillator. We note that

$$u(z_1,z_2) = c\,e^{-2H(z_1,z_2)}$$

is a time independent solution of (4.25). The constant c may be chosen so that
$\int\!\!\int u(z_1,z_2)\,dz_1\,dz_2 = 1$. Then $u(z_1,z_2)$ is an invariant or equilibrium density
for the process $(z_1(t), z_2(t))$. It is called the Maxwell-Gibbs distribution of the
oscillator (4.24). The remarkable fact is that although for (4.24) we cannot solve the
deterministic nonlinear problem, we have a simple description of the equilibrium
behavior of the stochastic problem.


5. The Theorem of Stratonovich and Has’minskii. In the previous section, we 
saw that when the coefficients of the stochastic equation are Markovian it is still 
practically impossible to solve the Kolmogorov equations. Sometimes it may be 
possible to obtain time independent solutions. When we do not assume that the 
coefficients of the equation are Markovian, the situation is worse since the apparatus 
of differential equations is not available. We shall now consider the main objective
of our study of stochastic equations: the construction of effective approximation
methods. Before we can state results in this direction we must first clarify the sense 
in which approximations will be considered. 

Let us return to the harmonic oscillator example of section 2. We rewrite it
here in a slightly different notation:

$$\frac{d^2 u}{dt'^2} + k'(t')\,u = 0, \qquad u(0) = u_0, \quad \frac{du(0)}{dt'} = \dot u_0. \tag{5.1}$$

We assume that $k'(t')$ is a stationary random function with mean $k_0'$ and write

$$k'(t') = k_0'\,[1 + \sigma x'(t')].$$

Thus

$$E\{x'(t')\} = 0, \qquad E\{x'(t')x'(s')\} = R'(t' - s'), \qquad l = \int_0^\infty R'(s)\,ds < \infty.$$

The dimensionless parameter $\sigma^2$ is the variance of the fluctuations of $x'(t')$ and
$R'(s')$ is its correlation function. We have also assumed that the correlation time $l$ is
finite. Let us define a dimensionless time scale

$$t = t'/l.$$

Then u as a function of t satisfies the stochastic equation

$$\frac{d^2 u}{dt^2} + k_0 u + \varepsilon x(t)\,u = 0, \qquad u(0) = u_0, \quad \frac{du(0)}{dt} = l\,\dot u_0. \tag{5.2}$$

Here we have introduced the notation

$$x(t) = x'(tl), \qquad k_0 = k_0'\,l^2, \qquad \varepsilon = k_0\sigma.$$

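In system form (5.2) becomes

$$\frac{d}{dt}\begin{pmatrix} u \\ \dot u \end{pmatrix} = A\begin{pmatrix} u \\ \dot u \end{pmatrix} + \varepsilon X(t)\begin{pmatrix} u \\ \dot u \end{pmatrix}, \qquad \begin{pmatrix} u(0) \\ \dot u(0) \end{pmatrix} = \begin{pmatrix} u_0 \\ l\,\dot u_0 \end{pmatrix}, \tag{5.3}$$

$$A = \begin{pmatrix} 0 & 1 \\ -k_0 & 0 \end{pmatrix}, \qquad X(t) = \begin{pmatrix} 0 & 0 \\ -x(t) & 0 \end{pmatrix}.$$

The role of $\varepsilon$ in the dimensionless formulation of the oscillator problem is now
clear. We shall assume $\varepsilon \ll 1$ and allow t to be large. Since $E\{X(t)\} = 0$, stochastic
effects will become important for times t of the order $1/\varepsilon^2$ (in the t units the
correlation time is 1). This corresponds to case (ii) of section 2. Therefore, straightforward
perturbation expansion in power series of $\varepsilon$ and then averaging of the result
will not yield interesting results since they are valid only for short times. We shall
state below a limit theorem due to Has'minskii [9] and Stratonovich [25] which
characterizes the behavior of the stochastic process defined by (5.3) in the limit
$\varepsilon \to 0$, $t \to \infty$, $\varepsilon^2 t$ = constant.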

The result is actually more general. It applies to a class of stochastic equations 
which we shall describe. Before doing this, however, we shall motivate the form 
of the general problem (see (5.4) below) by reconsidering (5.3). 


Let $z(t) = \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix}$ be defined by

$$z(t) = e^{-At}\begin{pmatrix} u(t) \\ \dot u(t) \end{pmatrix}.$$

From (5.3) we find that the vector-valued process z(t) satisfies the equation

$$\frac{dz(t)}{dt} = \varepsilon\,[e^{-At}X(t)e^{At}]\,z(t), \qquad z(0) = z_0 = \begin{pmatrix} u_0 \\ l\,\dot u_0 \end{pmatrix}.$$

We shall call z(t) the slowly varying part of the process $\begin{pmatrix} u \\ \dot u \end{pmatrix} = e^{At}z(t)$ because the rate
of change of z is of order $\varepsilon$. Our main interest is the characterization of the limit of
the process z(t), which depends on $\varepsilon$, as $\varepsilon$ tends to zero, t tends to infinity and $\varepsilon^2 t$
remains fixed. We now state the Stratonovich-Has'minskii theorem which provides
a very satisfactory answer to this question for a large class of problems.

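Consider the stochastic process z(t) with values in $R^n$ defined by the equation

$$\frac{dz(t)}{dt} = \varepsilon F(z(t), x(t), t), \qquad z(0) = z_0. \tag{5.4}$$

Here x(t) is an $R^m$ valued stochastic process and F is a mapping of $R^n \times R^m \times [0,\infty)$
into $R^n$ such that for some constant C

$$|F_i| < C, \qquad \left|\frac{\partial F_i}{\partial z_j}\right| < C, \qquad \left|\frac{\partial^2 F_i}{\partial z_j\,\partial z_k}\right| < C, \qquad i,j,k = 1,\ldots,n, \tag{5.5}$$

uniformly in z, x and t. F is a measurable function of x and t for fixed z. Assume that

$$E\{F(z, x(t), t)\} = 0 \tag{5.6}$$

and the limits

$$\lim_{T \to \infty}\frac{1}{T}\int_{t_0}^{t_0+T}\!\!\int_{t_0}^{s} E\left\{\sum_{j=1}^{n}\frac{\partial F_i(z, x(s), s)}{\partial z_j}\,F_j(z, x(\sigma), \sigma)\right\}d\sigma\,ds = b_i(z), \tag{5.7}$$

$$\lim_{T \to \infty}\frac{1}{T}\int_{t_0}^{t_0+T}\!\!\int_{t_0}^{s} E\{F_i(z, x(s), s)\,F_j(z, x(\sigma), \sigma)\}\,d\sigma\,ds = a_{ij}(z), \tag{5.8}$$

exist uniformly in $t_0$ and z. Assume further that if $\mathcal{U}_s^t$, $0 \le s \le t \le \infty$, denotes the
$\sigma$-algebra of events generated by $x(\sigma)$, $s \le \sigma \le t$, then

$$\sup_{t \ge 0}\ \sup_{A \in \mathcal{U}_0^t,\ B \in \mathcal{U}_{t+s}^\infty} |P(B \mid A) - P(B)| \le \beta(s), \tag{5.9}$$

and $\beta(s) \downarrow 0$ as $s \uparrow \infty$ so that $s^6\beta(s) \downarrow 0$ as well. Under the above assumptions the
process $z^{(\varepsilon)}(\tau)$ defined by

$$z^{(\varepsilon)}(\tau) = z(\tau/\varepsilon^2), \qquad \tau = \varepsilon^2 t, \tag{5.10}$$

converges weakly as $\varepsilon \to 0$, $0 \le \tau \le \tau_0$, to a Markov process $z^{(0)}(\tau)$ which is continuous
with probability 1 and whose infinitesimal generator is given by

$$L = \sum_{i,j=1}^{n} a_{ij}(z)\frac{\partial^2}{\partial z_i\,\partial z_j} + \sum_{j=1}^{n} b_j(z)\frac{\partial}{\partial z_j}. \tag{5.11}$$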

This remarkable result was enunciated by Stratonovich [26] who applied it 
successfully to a number of problems. A somewhat different form of the above 
formulation of the theorem, its proof, and several interesting remarks can be found 
in [25]. Here we shall present an application of the result to partial differential 
equations using the representations of section 4. 

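Suppose that x(t) in (5.4) is an $R^m$ valued diffusion Markov process which has an
everywhere positive transition probability density $P(t; x, x_0)$ and a unique positive
invariant measure $p(dx)$. Suppose all conditions of Has'minskii's theorem are
satisfied. For any bounded continuous function f on $R^{n+m}$ the function

$$u(s,t;z,x) = E_{s,z,x}\{f(z(t), x(t))\} \tag{5.12}$$

satisfies the backward Kolmogorov equation

$$\frac{\partial u}{\partial s} + \frac{1}{2}\sum_{i,j=1}^{m} A_{ij}(x)\frac{\partial^2 u}{\partial x_i\,\partial x_j} + \sum_{j=1}^{m} B_j(x)\frac{\partial u}{\partial x_j} + \varepsilon\sum_{j=1}^{n} F_j(z,x,s)\frac{\partial u}{\partial z_j} = 0,$$

$$s < t, \qquad u(t,t;z,x) = f(z,x).$$

Here $A_{ij}(x)$ and $B_j(x)$ are the infinitesimal variance matrix and drift vector of the
process x(t) and $E_{s,z,x}\{\ \}$ denotes expectation over the paths of the Markov process
(z(t), x(t)) starting from (z,x) when t = s. Let

$$\sigma = \varepsilon^2 s, \qquad \tau = \varepsilon^2 t, \qquad u^{(\varepsilon)}(\sigma,\tau;z,x) = u(\sigma/\varepsilon^2, \tau/\varepsilon^2; z, x).$$

Then,

$$\frac{\partial u^{(\varepsilon)}}{\partial \sigma} + \frac{1}{\varepsilon^2}\left[\frac{1}{2}\sum_{i,j=1}^{m} A_{ij}(x)\frac{\partial^2 u^{(\varepsilon)}}{\partial x_i\,\partial x_j} + \sum_{j=1}^{m} B_j(x)\frac{\partial u^{(\varepsilon)}}{\partial x_j}\right] + \frac{1}{\varepsilon}\sum_{j=1}^{n} F_j(z,x,\sigma/\varepsilon^2)\frac{\partial u^{(\varepsilon)}}{\partial z_j} = 0,$$

$$\sigma < \tau, \qquad u^{(\varepsilon)}(\tau,\tau;z,x) = f(z,x).$$

It follows immediately from the Stratonovich-Has'minskii theorem and (5.12) that
$u^{(\varepsilon)}(0,\tau;z,x)$ converges uniformly in z, $0 < \tau \le \tau_0$, to $u^{(0)}(\tau,z)$, independently
of x, where $u^{(0)}$ satisfies the problem

$$\frac{\partial u^{(0)}}{\partial \tau} = L u^{(0)}, \qquad u^{(0)}(0,z) = \int f(z,x)\,p(dx).$$

The correct initial conditions for $u^{(0)}$ are obtained in a manner similar to that
used in obtaining the result for (4.23). The coefficients $a_{ij}(z)$, $b_j(z)$ in the operator L
are obtained from (5.7) and (5.8) where $E\{\ \}$ denotes expectation over the paths
of x(t) starting at t = 0 with the invariant measure $p(dx)$. Thus in (5.7) and (5.8)
the expectation can be written explicitly as integration with respect to appropriate
weights involving $p(dx)$ and $P(t; x, x_0)$. Let us note that the singular perturbation
result we have just deduced, even formally, is a nontrivial result to obtain.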

When x(t) is a Markov chain we can again apply the Stratonovich-Has’minskii 
theorem and obtain a singular perturbation result similar to that of (4.23) where 
a hyperbolic system converges to a diffusion equation. Now, however, we may 
allow F to depend on z, x, t in any manner compatible with the hypothesis of the
theorem stated above. 

The significance of the Stratonovich-Has’minskii result can best be understood 
when it is seen as a generalization of the classical central limit theorem to processes 
that are functionals of asymptotically independent processes (i.e., condition (5.9)
holds) defined via a differential equation. Indeed if F(z,x,t) = x then we trivially 
obtain the classical central limit theorem [4]. The usefulness of this generalized 
central limit theorem is, in practice, limited severely by the fact that few diffusion 
equations with variable coefficients are solvable explicitly. In any case, this is the best 
that can be achieved under the circumstances. 


6. Examples. In this section we shall apply the result of Has’minskii and Strato- 
novich to two problems. One is the harmonic oscillator problem (5.3) and the other 
is the wave propagation problem (2.20). 

Let us transform (5.3) by introducing slowly varying dependent variables. We 
note that F(z, x,t) is linear in z here and thus (5.5) is violated. To circumvent this 
difficulty we change coordinates appropriately. Let us write (5.3) in slowly varying 
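form:

$$\dot z_1 = \varepsilon x(t)\left[\frac{z_1}{\sqrt{k_0}}\sin(\sqrt{k_0}\,t)\cos(\sqrt{k_0}\,t) + \frac{z_2}{k_0}\sin^2(\sqrt{k_0}\,t)\right], \qquad z_1(0) = z_1,$$

$$\dot z_2 = \varepsilon x(t)\left[-z_1\cos^2(\sqrt{k_0}\,t) - \frac{z_2}{\sqrt{k_0}}\sin(\sqrt{k_0}\,t)\cos(\sqrt{k_0}\,t)\right], \qquad z_2(0) = z_2.$$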


The change of variables

$$z_1 = e^{r}\cos\theta, \qquad z_2 = -\sqrt{k_0}\,e^{r}\sin\theta, \qquad -\infty < r < \infty, \quad 0 \le \theta \le 2\pi,$$

leads to a system of equations of the form

$$\dot r = \varepsilon x(t)\,g_1(\theta,t), \quad r(0) = r_0, \qquad \dot\theta = \varepsilon x(t)\,g_2(\theta,t), \quad \theta(0) = \theta_0. \tag{6.1}$$

The functions $g_1$ and $g_2$ are trigonometric polynomials in $\theta$ and t and thus in the
representation (6.1) the oscillator problem satisfies (5.5) provided x(t) is a bounded
random process, which we now assume. We assume further that x(t) is stationary
with mean zero, has correlation function $E\{x(t+s)x(t)\} = R(s)$, and satisfies (5.9).
Thus all hypotheses are satisfied since explicit computation shows that the limits
(5.7) and (5.8) exist uniformly in $t_0$ and are independent of r and $\theta$. It follows
therefore that L, defined by (5.11), has constant coefficients and is given by

$$L = \frac{b}{2}\frac{\partial^2}{\partial r^2} + b\frac{\partial}{\partial r} + \left(a + \frac{b}{2}\right)\frac{\partial^2}{\partial\theta^2} - c\frac{\partial}{\partial\theta},$$

$$a = \frac{1}{4k_0}S(0), \qquad b = \frac{1}{4k_0}\,\mathrm{Re}\,S(2\sqrt{k_0}), \qquad c = \frac{1}{4k_0}\,\mathrm{Im}\,S(2\sqrt{k_0}), \qquad S(\omega) = \int_0^\infty R(\sigma)\,e^{i\omega\sigma}\,d\sigma.$$
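
One way to see that $g_1$ and $g_2$ are trigonometric polynomials is to carry out the change of variables explicitly. The sketch below assumes the system form (5.3) with $A = \begin{pmatrix} 0 & 1 \\ -k_0 & 0 \end{pmatrix}$ and $X(t) = \begin{pmatrix} 0 & 0 \\ -x(t) & 0 \end{pmatrix}$. The symbols $\omega = \sqrt{k_0}$ and $\varphi = \omega t + \theta$ are introduced here for brevity, and this is not necessarily the route taken in the references.

```latex
% Assumed notation (not in the source): \omega = \sqrt{k_0}, \varphi = \omega t + \theta.
% With z_1 = e^{r}\cos\theta, z_2 = -\omega e^{r}\sin\theta and (u,\dot u)^T = e^{At} z:
\begin{align*}
u &= e^{r}\cos\varphi, \qquad \dot u = -\omega\,e^{r}\sin\varphi. \\
\intertext{Differentiating both relations and using $\ddot u = -(k_0 + \varepsilon x(t))\,u$ gives}
\dot r\cos\varphi - \dot\theta\sin\varphi &= 0, \qquad
\dot r\sin\varphi + \dot\theta\cos\varphi = \frac{\varepsilon x(t)}{\omega}\cos\varphi, \\
\intertext{so that $\dot r = \varepsilon x(t)\,g_1$ and $\dot\theta = \varepsilon x(t)\,g_2$ with}
g_1(\theta,t) &= \frac{1}{2\sqrt{k_0}}\sin 2(\sqrt{k_0}\,t + \theta), \qquad
g_2(\theta,t) = \frac{1}{2\sqrt{k_0}}\bigl[1 + \cos 2(\sqrt{k_0}\,t + \theta)\bigr].
\end{align*}
```

Both functions are bounded, with bounded derivatives in $\theta$, and do not depend on r, which is why the condition (5.5) holds in the $(r,\theta)$ variables once x(t) is bounded.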


The statistical properties of the limit process $z^{(0)}(\tau)$ can be obtained from those
of $r(\tau)$ and $\theta(\tau)$ whose transition probability density is the solution of the
initial value problem

$$\frac{\partial P}{\partial \tau} = LP, \qquad P(0,r,\theta;r_0,\theta_0) = \delta(r - r_0)\,\delta(\theta - \theta_0).$$


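This equation can be solved explicitly since it has constant coefficients. Then by
transforming to the original variables in (5.3) we find that

$$E\{u(t)\} = \exp\left\{\frac{1}{4k_0}\left[\mathrm{Re}\,S(2\sqrt{k_0}) - S(0)\right]\varepsilon^2 t\right\}\cos\left(\sqrt{k_0}\,t - \frac{1}{4k_0}\,\varepsilon^2 t\,\mathrm{Im}\,S(2\sqrt{k_0})\right) + o(1), \qquad 0 \le t \le 1/\varepsilon^2. \tag{6.2}$$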


An interesting feature of (6.2) is that the random fluctuations in the spring constant
cause the mean displacement to decay with time on the slow time scale $\varepsilon^2 t$. Similarly
we find that

$$E\{u^2(t)\} = \frac{1}{2}\exp\left\{\frac{1}{2k_0}\left[\mathrm{Re}\,S(2\sqrt{k_0}) - 2S(0)\right]\varepsilon^2 t\right\}\cos\left(2\sqrt{k_0}\,t - \frac{1}{2k_0}\,\varepsilon^2 t\,\mathrm{Im}\,S(2\sqrt{k_0})\right) + \frac{1}{2}\exp\left(\frac{1}{k_0}\,\mathrm{Re}\,S(2\sqrt{k_0})\,\varepsilon^2 t\right) + o(1), \qquad 0 \le t \le 1/\varepsilon^2. \tag{6.3}$$

The result (6.3) shows that the mean square displacement is growing on the time
scale $\varepsilon^2 t$. This follows from the fact that $\mathrm{Re}\,S(\omega)$ is the Fourier cosine transform
of a correlation function and hence is positive.
Let us now consider problem (2.20). We assume that

$$n^2(x) = 1 + \varepsilon z(x),$$

where x is the "time" variable and z(x) is a bounded zero mean stationary stochastic
process for which (5.9) holds.

We wish to find $E\{|T|^2\}$, the mean square of the transmission coefficient, when
L, the width of the region of the inhomogeneities, is of order $1/\varepsilon^2$. One way of
treating (2.20) is presented in [18], [21]. Another way [20] is to find a stochastic
equation for T = T(L) as a function of the width L and apply the theorem of section 5
to it. We omit the calculations here and state the result:


ee) 2,—x2 
BT) P} = eet [| ~ 2 O_o, 
/t 0 cosh(e/sL x) 


OS LS I1/e’, 


s= ua [° R@e0s (2kt)dt, R(t) = E{z(x + 1)z(x)}. 
0 


From this result we find that the mean power transmission coefficient is
approximately 1/2 when the dimensionless quantity $\varepsilon^2 sL$ is of order 1.

Some other applications of an operator theoretic version of this result [22] 
are given in [2]. Other interesting applications are considered in [19]. 


Acknowledgment. The author wishes to thank the editor, and Professor J. B. Keller, for sugges- 
ting several improvements in the manuscript. 


References 


1. W. E. Boyce, Random eigenvalue problems, in Probabilistic Methods in Applied Mathematics,
A. T. Bharucha-Reid, editor, Academic Press, New York, 1968.

2. R. Burridge and G. C. Papanicolaou, The geometry of coupled mode propagation in random
media, Comm. Pure Appl. Math., to appear.

3. E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill,
New York, 1955.

4. W. Feller, An Introduction to Probability Theory and Its Applications, Vols. I, II, Wiley,
New York, 1968.

5. U. Frisch, Wave propagation in random media, in Probabilistic Methods in Applied
Mathematics, A. T. Bharucha-Reid, editor, Academic Press, New York, 1968.

6. I. M. Gelfand and A. M. Yaglom, Integration in functional spaces and its applications to
quantum physics, J. Math. Phys., 1 (1960) 48-69.

7. I. I. Gikhman and A. V. Skorokhod, Introduction to the Theory of Random Processes,
Saunders, Philadelphia, 1969.

8. R. Griego and R. Hersh, Theory of random evolution with applications to partial differential
equations, Trans. Amer. Math. Soc., 156 (1971) 405-418.

9. R. Z. Has'minskii, A limit theorem for the solution of differential equations with random
right-hand sides, Theory of Prob. and Applications, 11 (1966) 390-406.

10. M. Kac, On the distribution of certain Wiener functionals, Trans. Amer. Math. Soc., 65
(1949) 1-13.

11. M. Kac, Some Stochastic Problems in Physics and Mathematics, Magnolia Petroleum Co.
Lectures in Pure and Applied Science, No. 2, 1956.

12. M. Kac, Probability and Related Topics in the Physical Sciences, Interscience, New York,
1959.

13. A. I. Khinchine, Asymptotische Gesetze der Wahrscheinlichkeitsrechnung, Chelsea, New
York, 1948.

14. R. Kubo, Stochastic Liouville equation, J. Math. Phys., 4 (1963) 174-183.

15. M. Lax, Classical Noise IV; Langevin Methods, Rev. Mod. Phys., 38 (1966) 561-566.

16. H. P. McKean Jr., Stochastic Integrals, Academic Press, New York, 1969.

17. M. Loève, Probability Theory, Van Nostrand, Princeton, N. J., 1963.

18. J. A. Morrison, Application of a limit theorem to solutions of a stochastic differential
equation, J. Math. Anal. Appl., 39 (1972) 13-35.

19. J. A. Morrison and J. McKenna, article in Proceedings of Symposium on Stochastic
Equations, SIAM-AMS, vol. 6, to appear.

20. G. C. Papanicolaou, Wave propagation in a one-dimensional random medium, SIAM J.
on Appl. Math., 21 (1971) 13-18.

21. G. C. Papanicolaou and J. B. Keller, Stochastic differential equations with applications to
random harmonic oscillators and wave propagation in random media, SIAM J. on Appl. Math.,
21 (1971) 287-305.

22. G. C. Papanicolaou and R. Hersh, Some limit theorems for stochastic equations and
applications, Indiana Univ. Math. J., 21 (1972) 815-840.

23. M. Pinsky, Differential equations with a small parameter and the central limit theorem for
functions on a finite Markov chain, Z. Wahrscheinlichkeitstheorie Verw. Gebiete, 9 (1968) 101-111.

24. M. Pinsky, Multiplicative operator functionals and their asymptotic properties, in Advances
in Probability, vol. 3, Marcel Dekker, New York.

25. R. L. Stratonovich, Conditional Markov Processes and Their Application to the Theory of
Optimal Control, Elsevier, New York, 1968.

26. R. L. Stratonovich, Topics in the Theory of Random Noise, Vols. I, II, Gordon and Breach,
New York, 1963.

27. W. M. Wonham, Random differential equations in control theory, in Probabilistic Methods
in Applied Mathematics, Vol. 2, A. T. Bharucha-Reid, editor, Academic Press, New York, 1970.


THE MAIN CRISES 
S. BIRNBAUM, Bronx Community College 


In the November 1971 issue of this magazine, Gail S. Young discussed some crises 
of the mathematical world in seven parts. I think the crises listed in sixth and seventh 
place such as the Viet Nam war and unemployment should be in first place. Further- 

more, I must disagree with a statement such as this: “Some of the problems — for 
example the war in Viet Nam — are ones that we as mathematicians, or the organi- 
zation we represent, can do nothing much about.” 

This article does not represent an official position of either the Editors of the American
Mathematical Monthly, or the Mathematical Association of America. Approximately once a month we
receive a suggestion for starting a new section of the Monthly, and fortunately or unfortunately, we
cannot accept almost all such suggestions. Editor.


The employment crisis is mentioned as “‘another conceivably controversial ex- 
ample.” I’m sure that young Ph.D.’s who have failed to obtain employment would 
want this treated as problem number one. 

I think that we as mathematicians and citizens must come to effective grips with 
the above-mentioned and other problems which come under the heading of socio- 
economic-political issues. It is very comforting to believe that mathematics could 
make great contributions in helping society solve its problems — as goes mathe- 
matics so goes society. But it is even truer that as goes society so goes mathematics. 
Thus the mere existence of problems requiring the participation of mathematicians 
for their solution does not guarantee full employment for mathematicians. We don’t 
need any America and Africa that have the gravest problems in feeding their popula- 
tions employ the least numbers of mathematicians. 

I am one of those optimists who believe that the more mathematicians employed 
for peaceful purposes in any country, the better off it must be. From this it follows 
that I, like most mathematicians, would prefer to see increased production and 
employment of mathematicians. If such is to the interest of our country and its 
people, then it is our duty to tackle the problem of unemployment or underemploy- 
ment and help solve it for the general good. Indeed, failure to work actively for full 
employment may result in loss for not a few mathematicians who fancy themselves in 
secure positions. 

Two years ago I wrote a little article about the spectre of unemployment which 
would increasingly haunt us mathematicians among others. In it I predicted that our 
economy was approaching a stage of chronic crisis. When I showed it to some of my 
colleagues and asked if I should submit it for publication they told me, “‘Go ahead 
but it won’t be printed!” 

This is the attitude we must put behind us. We must try to effect the development 
of this country in a peaceful and progressive direction. We must agree that this coun- 
try has many problems requiring the participation of mathematicians in their solu- 
tion; but we must also recognize that this is no guarantee of full employment. If we 
agree with all this then we enter the frontier of the most controversial part of this 
analysis: Concretely, how can we go about helping to direct this country onto a 
path that is best for its people and thus its mathematicians? At this point, a state- 
ment of a concrete program for action in the MONTHLY would be in opposition to the 
tradition of avoidance of controversial issues. Therefore the first step must consist in 
realizing that the responsibility of the mathematical community in influencing our 
country in this time of growing crisis takes precedence over tradition. We must be 
willing to enter the area of dialogue over controversial issues. To this end we can make 
a modest beginning by reserving, in the MONTHLY, a small section to be devoted 
to discussing the socio-economic-political problems ‘bugging’ our members. 

To the younger and the impatient members this may be a mouse of a conclusion 
for a mountainous beginning; but it is a first step and as such, small as it is, is the 
most important one. 


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 
The present backlog for this Department is substantial. Until further notice, new manuscripts
cannot be accepted. This moratorium will probably continue until June 1, 1973; authors are
requested to hold their manuscripts pending a further announcement.


THE SIGN OF THE BERNOULLI NUMBERS 
L. J. MORDELL†


The Bernoulli numbers are defined by the usual expansion,

$$\frac{x}{e^x - 1} = 1 - \frac{x}{2} + \sum_{n=1}^{\infty}(-1)^{n-1}B_n\frac{x^{2n}}{(2n)!} = \sum_{n=0}^{\infty} b_n\frac{x^n}{n!}, \tag{1}$$

say. It is classic that $B_n > 0$.
A proof by Euler follows from his formula,

$$(2n+1)B_n = \sum_{r=1}^{n-1}\binom{2n}{2r}B_rB_{n-r},$$

for which an involved proof is given in Nielsen [1]. Well known are the analytic
proofs leading to

$$B_r = \frac{(2r)!}{2^{2r-1}\pi^{2r}}\sum_{n=1}^{\infty}\frac{1}{n^{2r}}.$$


A proof using Bernoulli polynomials is given in Nörlund [2]. Lastly, an arithmetic
proof is given by Uspensky and Heaslet [3], which is rather complicated and 
unilluminating. 

I am not aware of any proof based upon the simplest principles and so the one
now given may be of interest. We have

$$\frac{x}{e^x + 1} = \frac{x}{e^x - 1} - \frac{2x}{e^{2x} - 1} = \sum_{r=0}^{\infty}(1 - 2^r)\frac{b_rx^r}{r!}.$$

Multiply both sides by $x/(e^x - 1)$ and substitute from (1) the expansion for
$2x/(e^{2x} - 1)$. Then

$$\frac{x}{2}\sum_{s=0}^{\infty}\frac{2^s b_s x^s}{s!} = \sum_{r=0}^{\infty}(1 - 2^r)\frac{b_rx^r}{r!}\ \sum_{s=0}^{\infty}\frac{b_sx^s}{s!}.$$

Equate coefficients of $x^{2n}$ on both sides. Since $b_n$ is zero if n is > 1 and odd, we have


† Deceased March 11, 1972.


548 L. CARLITZ AND R. SCOVILLE [May 


$$\sum (1 - 2^r) \frac{b_r b_s}{r!\, s!} = 0, \qquad (r + s = 2n,\ r \ge 0,\ s \ge 0).$$

The terms with r = 0, 1 contribute nothing to the left-hand side. Isolating the term
with r = 2n, s = 0, we have

$$(1 - 2^{2n}) \frac{b_{2n}}{(2n)!} + \sum (1 - 2^r) \frac{b_r b_s}{r!\, s!} = 0, \qquad (r + s = 2n,\ rs \ne 0),$$

where we may suppose that r and s are both even. We assume now that $(-1)^{m-1} b_{2m} > 0$
for $1 \le m \le n - 1$. This is true for m = 1, and we prove that it holds for m = n.
The term in the summation has the sign

$$(-1)^{1 + r/2 - 1 + s/2 - 1} = (-1)^{n-1}.$$

Hence, since $1 - 2^{2n} < 0$, we have $(-1)^{n-1} b_{2n} > 0$, and this finishes the proof.
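
The induction can be checked numerically. The following is a minimal sketch, not part of the original note: it computes the coefficients $b_n$ of $x/(e^x - 1)$ exactly with rational arithmetic (the function names `bernoulli_b` and `convolution_sum` are illustrative), then verifies that the convolution sum vanishes for $n \ge 2$ and that $(-1)^{n-1} b_{2n} > 0$ for the first few $n$.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_b(count):
    """Return [b_0, ..., b_{count-1}] where x/(e^x - 1) = sum b_n x^n / n!."""
    b = [Fraction(1)]
    for n in range(1, count):
        # Standard recurrence: sum_{k=0}^{n} C(n+1, k) b_k = 0 for n >= 1.
        b.append(-sum(comb(n + 1, k) * b[k] for k in range(n)) / (n + 1))
    return b


def convolution_sum(b, n):
    """Sum of (1 - 2^r) b_r b_s / (r! s!) over r + s = 2n with r, s >= 0."""
    return sum(
        (1 - 2**r) * b[r] * b[2 * n - r] / (factorial(r) * factorial(2 * n - r))
        for r in range(2 * n + 1)
    )


b = bernoulli_b(21)
# The identity used in the proof holds for n >= 2 (for n = 1 the odd
# coefficient b_1 = -1/2 contributes on the other side).
identity_holds = all(convolution_sum(b, n) == 0 for n in range(2, 11))
signs_ok = all((-1) ** (n - 1) * b[2 * n] > 0 for n in range(1, 11))
print(identity_holds, signs_ok)
```

Exact fractions are used so that the vanishing of the sum is tested as an identity rather than up to rounding.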


References 


1. N. Nielsen, Traité élémentaire des nombres de Bernoulli, Gauthier-Villars, Paris, 1923, p. 42.

2. N. E. Nörlund, Vorlesungen über Differenzenrechnung, Chelsea, New York, 1924, pp. 22-23.

3. J. V. Uspensky and M. A. Heaslet, Elementary Number Theory, McGraw-Hill, New York, 
1939, Chapter IX. 


THE SIGN OF THE BERNOULLI AND EULER NUMBERS 
LEONARD CARLITZ and RICHARD SCOVILLE, Duke University 


Put $x \cot x = \sum_{n=0}^{\infty} (-1)^n 2^{2n} B_{2n} x^{2n}/(2n)!$. It is well known that
(1) $(-1)^{n-1} B_{2n} > 0 \quad (n = 1, 2, 3, \cdots)$.


For proofs see for example [2, Ch. 2] and [3, Ch. 9]; another simple proof has 
recently been given by Mordell [1]. 
Moreover, if we put

$$\tan x = \sum_{n=0}^{\infty} T_{2n+1} x^{2n+1}/(2n + 1)!,$$

$$\sec x = \sum_{n=0}^{\infty} (-1)^n E_{2n} x^{2n}/(2n)!,$$

it is known [2, Ch. 2] that
(2) $T_{2n+1} > 0, \quad (-1)^n E_{2n} > 0$.


It may be of interest to note that (1) and (2) can be proved very simply in the
following way. Differentiation of $\tan(\arctan x) = x$ gives

$$\tan'(\arctan x) = 1 + x^2.$$

A second differentiation gives

$$\tan''(\arctan x) = 2x(1 + x^2),$$

while a third yields

$$\tan'''(\arctan x) = (2 + 6x^2)(1 + x^2)$$

and so on. After $k$ steps we get

$$T_k = ((1 + x^2)D)^k x \,\big|_{x=0},$$

which evidently implies $T_{2n+1} > 0$.
Similarly, differentiation of $\sec(\arctan x) = \sqrt{1 + x^2}$ gives

$$\sec'(\arctan x) = x\sqrt{1 + x^2},$$
$$\sec''(\arctan x) = (1 + 2x^2)\sqrt{1 + x^2},$$
$$\sec'''(\arctan x) = (5x + 6x^3)\sqrt{1 + x^2},$$

and so on. This yields

$$(-1)^n E_{2n} = ((1 + x^2)D)^{2n} \sqrt{1 + x^2} \,\big|_{x=0} > 0.$$

To prove (1) we note that $x \cot x - 2x \cot 2x = x \tan x$, so that

(3) $(-1)^{n-1} 2^{2n} (2^{2n} - 1) B_{2n} = 2n\, T_{2n-1} \quad (n > 0)$.

Hence (1) is implied by $T_{2n-1} > 0$.
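
The repeated differentiation can be mechanized. The sketch below is an illustration, not the authors' computation: a polynomial is stored as a coefficient list, and one step replaces $P$ by $(1 + x^2)P'$ for the tangent case, or by $(1 + x^2)P' + xP$ for the secant case, where the $k$th derivative is written as $P_k(x)\sqrt{1 + x^2}$. The helper names `step` and `derivatives_at_zero` are hypothetical. Every coefficient stays non-negative, which is the positivity argument of the note.

```python
def step(poly, with_root):
    """One differentiation step on a coefficient list (lowest degree first).

    Tangent case: P -> (1 + x^2) P'.
    Secant case (with_root=True): P -> (1 + x^2) P' + x P, where the k-th
    derivative is P_k(x) * sqrt(1 + x^2).
    """
    deriv = [k * c for k, c in enumerate(poly)][1:]
    out = [0] * (len(poly) + 2)
    for k, c in enumerate(deriv):
        out[k] += c        # contribution of 1 * P'
        out[k + 2] += c    # contribution of x^2 * P'
    if with_root:
        for k, c in enumerate(poly):
            out[k + 1] += c  # contribution of x * P
    return out


def derivatives_at_zero(start, with_root, parity, count):
    """Collect the constant terms P_k(0) for k of the given parity."""
    poly, values, k = list(start), [], 0
    while len(values) < count:
        if k % 2 == parity:
            values.append(poly[0])
        poly = step(poly, with_root)
        k += 1
    return values


tangent = derivatives_at_zero([0, 1], False, 1, 5)  # T_1, T_3, T_5, ...
secant = derivatives_at_zero([1], True, 0, 5)       # |E_0|, |E_2|, |E_4|, ...
print(tangent, secant)
```

The constant terms reproduce the familiar tangent numbers and the absolute values of the Euler numbers.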

We remark that by the method used in proving (2) we can show for example
that the coefficients of $\sec^{\lambda} x$ are positive for $\lambda > 0$. The coefficients, except for
sign, are Euler numbers of higher order [2, Ch. 6].


References 


1. L. J. Mordell, The sign of the Bernoulli numbers, this MONTHLY, (preceding pages).
2. N. E. Nörlund, Vorlesungen über Differenzenrechnung, Teubner, Leipzig and Berlin, 1924.
3. J. V. Uspensky and M. A. Heaslet, Elementary Number Theory, McGraw-Hill, New York, 1939.


BASICALLY BOUNDED SETS AND A GENERALIZED HEINE-BOREL THEOREM 
NEIL HINDMAN, California State College, Los Angeles 


1. Introduction. The concept of boundedness plays an important role in the
theory of metric spaces and has been defined in the contexts of topological vector
spaces and uniform spaces. There is thus some inherent appeal in generalizing the
notion to arbitrary topological spaces. This has been explored in a general setting
by Hu [1].

The advantage of the current notion is that it is defined for arbitrary topological 
spaces, yields a universal form of the Heine-Borel theorem, and generalizes the usual 
notions of boundedness for as wide a class of spaces as is possible in view of this 
theorem. The concept derives from an idea of Raymond Killgrove, to whom this 
author is indebted. 


DEFINITION. A subset A of a topological space X is basically bounded if each 
basis for X has a finite subfamily covering A. 


2. Generalizations of the Heine-Borel and Bolzano-Weierstrass theorems. 


THEOREM | (Generalized Heine-Borel). Each closed and basically bounded 
subset of a topological space X is compact. 


Proof. Let A be closed and basically bounded in X and let $\mathcal{V}$ be an open cover
of A. Let $\mathcal{B} = \{U : U$ is open in X and either $U \cap A = \emptyset$ or $U \subseteq V$ for some V in
$\mathcal{V}\}$. Now $\mathcal{B}$ is a base for X since A is closed. The finite subfamily of $\mathcal{B}$ which covers A
guarantees that a finite subfamily of $\mathcal{V}$ covers A.


THEOREM 2 (Generalized Bolzano—Weierstrass). Each infinite basically bounded 
subset of a topological space has an accumulation point. 


Proof. Any infinite basically bounded set without accumulation points would 
be closed, hence compact, by Theorem 1. 


THEOREM 3. Each basically bounded sequence in a topological space clusters. 
If a sequence converges then it is basically bounded.


3. Generalization of common boundedness notions. Recall that a subset A of
a topological vector space L over a field K is said to be bounded if for each neighborhood
V of 0 there is an element $\lambda$ of K such that $A \subseteq \lambda V$. A is said to be totally
bounded if for each neighborhood V of 0 there is a finite subset F of L such that
$A \subseteq F + V$. Recall also that a subset A of a uniform space X is said to be totally bounded
if for each member V of the uniformity there is a finite subset F of X such that
$A \subseteq V[F]$.


THEOREM 4. Let X be a pseudo-metric space or a topological vector space. 
The following statements are equivalent: 

(a) If a subset of X is bounded (respectively totally bounded) then it is basically
bounded.

(b) A subset of X is bounded (respectively totally bounded) if and only if
it is basically bounded.

(c) Each closed and bounded (respectively closed and totally bounded)
subset of X is compact.


Proof. The proof will be done in case X is a topological vector space (over the 
field K). The other case is similar. 

(a) $\Rightarrow$ (b). Let A be a basically bounded subset of X and let V be a neighborhood
of 0. Let $\mathcal{B} = \{x + U : x \in X$ and U is an open neighborhood of 0 contained in $V\}$.
Then $\mathcal{B}$ is a base for X and so A is contained in finitely many translates of V. A is
thus totally bounded and hence bounded.

(b) $\Rightarrow$ (c). Any closed and bounded (respectively closed and totally bounded)
subset of X is closed and basically bounded, hence compact.

(c) $\Rightarrow$ (a). Let A be a bounded (respectively totally bounded) subset of X. Then
cl A is bounded (respectively totally bounded) so is compact. Any compact set is
basically bounded.


THEOREM 5. Let X be a uniform space. The following statements are equivalent:
(a) If a subset of X is totally bounded then it is basically bounded. 

(b) A subset of X is totally bounded if and only if it is basically bounded. 

(c) If a subset of X is closed and totally bounded then it is compact. 


Proof. (a) $\Rightarrow$ (b). Let A be a basically bounded subset of X and let U be a member
of the uniformity. Let $\mathcal{B} = \{W : W$ is open and $W \subseteq U(x)$ for some x in $X\}$. Then
$\mathcal{B}$ is a basis for X so there is a finite subfamily of $\mathcal{B}$ covering A. Hence
$A \subseteq \bigcup_{i=1}^{n} U(x_i)$ for some $\{x_i\}_{i=1}^{n} \subseteq X$.

The rest of the proof is identical to that of Theorem 4. 

As previously remarked, Theorems 4 and 5 show that the basically bounded 
concept generalizes these five concepts for precisely as wide a class of spaces as is 
possible in view of Theorem 1. 

Hu [1] introduced the concept of compact bounded sets whereby a set is compact
bounded if its closure is compact. Theorems 1, 2 and 3 also hold if "basically bounded"
is replaced by "compact bounded". Clearly each compact bounded set is basically
bounded and it is easily proved that in a regular space the converse holds. The following
example shows that the basically bounded concept is indeed more general
than that of compact boundedness.


Example. A Hausdorff space with a basically bounded subset which is not
compact bounded.

Let X be the closed interval [0,1] and let $A = X \setminus \mathbf{Q}$. Let $\mathcal{T}' = \{U \cup (V \cap A) : U$
and V are open in the usual topology on $[0,1]\}$. $\mathcal{T}'$ is easily checked to be a Hausdorff
topology on X. To see that A is basically bounded let $\mathcal{B}$ be any base for X. We may
write $\mathcal{B} = \{U_\delta \cup (V_\delta \cap A) : \delta \in \Delta\}$. Then $\{U_\delta : \delta \in \Delta\} \cup \{V_\delta : \delta \in \Delta\}$ is an open cover
of [0,1] in the usual topology hence has a finite subcover $\{U_\delta : \delta \in \Delta'\} \cup \{V_\delta : \delta \in \Delta'\}$.
But then $A \subseteq \bigcup_{\delta \in \Delta'} (U_\delta \cup (V_\delta \cap A))$.

Now let $x_0 \in A$ and let $\{s_n\}_{n=1}^{\infty}$ be a sequence of rationals converging (in the
usual topology) to $x_0$. Then A is a neighborhood of $x_0$ missing $\{s_n\}_{n=1}^{\infty}$ so X is
not compact while X = cl A.


Reference 


1. S.-T. Hu, Boundedness in a topological space, J. Math. Pures Appl., 28 (1949) 287-320.


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and 
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


A PROBLEM ON RATIONAL FUNCTIONS 
A. K. Pizer, University of California, Los Angeles 


E. Straus and W. Adams [1] have shown that a nonconstant polynomial with 
complex coefficients is determined by the preimages of two points. More precisely, 
after normalizing, we have 


PROPOSITION 1 (Straus and Adams). Let p(z) and q(z) be polynomials, not 
both constant, with coefficients in C, the field of complex numbers. Assume p(z) 
and q(z) have the same set of zeros and the same set of preimages of 1, i.e., 


$$p(z_0) = 0 \iff q(z_0) = 0, \qquad p(z_0) = 1 \iff q(z_0) = 1 \qquad \text{for } z_0 \in \mathbf{C}.$$
Then $p(z) = q(z)$.

Proof. Assume $n = \deg p(z) \ge \deg q(z)$. Consider the polynomial $F(z) = p'(z)(p(z) - q(z))$.
Then $\deg F(z) \le 2n - 1$. But F(z) has 2n zeros (counted with multiplicity)
occurring at those values $z_0$ where $p(z_0) = 0$ or $p(z_0) = 1$. Since $p'(z) \not\equiv 0$,
we see $p(z) = q(z)$.

It is natural to ask (and in fact was asked by Straus and, independently, by J. G.
Clunie) if the analogous statement is true for rational functions on the complex
Riemann sphere. The normalized question reads


Question I. Let f(z) and g(z) be two nonconstant rational functions on the 
Riemann sphere having the same sets of zeros, poles, and preimages of 1. Then 
does it follow that f(z) = g(z)? 




The answer to Question I is negative. In fact

$$f_0(z) = \frac{-4z^3}{(z - 1)^3 (z + 1)}$$

and

$$g_0(z) = \frac{-4z}{(z - 1)(z + 1)^3}$$

satisfy the hypothesis of the question, but are not identical.
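
The claim about preimages of 1 can be checked by polynomial arithmetic. The sketch below assumes the pair is $f_0(z) = -4z^3/((z-1)^3(z+1))$ and $g_0(z) = -4z/((z-1)(z+1)^3)$, which is a reading of the garbled display, and uses illustrative helper names. The solutions of $f_0 = 1$ and $g_0 = 1$ are the roots of numerator minus denominator, so it is enough that the two differences agree and do not vanish at the poles $z = \pm 1$. Both functions vanish at infinity, so infinity is not a preimage of 1.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def poly_sub(p, q):
    """Subtract q from p, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [a - b for a, b in zip(p, q)]


def poly_eval(p, z):
    """Evaluate a coefficient list at z."""
    return sum(c * z**k for k, c in enumerate(p))


z_minus_1, z_plus_1 = [-1, 1], [1, 1]
f_num, f_den = [0, 0, 0, -4], poly_mul(poly_pow(z_minus_1, 3), z_plus_1)
g_num, g_den = [0, -4], poly_mul(z_minus_1, poly_pow(z_plus_1, 3))

# Roots of these polynomials are the finite solutions of f_0 = 1 and g_0 = 1.
f_eq = poly_sub(f_num, f_den)
g_eq = poly_sub(g_num, g_den)
print(f_eq, g_eq)
```

The zeros (0 and infinity) and the poles (1 and -1) can be read off the factored forms directly, so only the preimages of 1 need a computation.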


REMARK. Nonconstant rational functions are determined by the preimage of 
four points. The proof is analogous to that of Proposition 1. See Theorem 3 in [1]. 

Let us say that a pair of rational functions are associated if they satisfy the hypothesis
of Question I. Notice that if f(z) and g(z) are associated and q(z) is any
rational function, then f(q(z)) and g(q(z)) are also associated.

Letting $\beta(z) = -1/z$, we see that $g_0(z) = f_0(\beta(z))$ and letting $\tau(z) = (iz + i)/(-z + 1)$,
we find

$$f_1(z) = f_0(\tau(z)) = \frac{(z + 1)^3 (z - 1)}{(z + i)^3 (z - i)}$$

and

$$g_1(z) = g_0(\tau(z)) = \frac{(z - 1)^3 (z + 1)}{(z - i)^3 (z + i)}$$

are associated and $g_1(z) = f_1(-z)$. Thus $g_1(z) = f_1(\alpha(z))$, where $\alpha$ is a rotation
of order 2. Letting $g_n(z) = g_1(z^n)$, $f_n(z) = f_1(z^n)$ we get an associated distinct pair
$f_n(z)$, $g_n(z)$ such that $g_n(z) = f_n(\gamma(z))$, where $\gamma$ is a rotation of order 2n. Thus we see
that for any fractional linear transformation $\mu$ of finite even order, there exists an
associated distinct pair $f_\mu(z)$, $g_\mu(z)$ of rational functions such that $g_\mu(z) = f_\mu(\mu(z))$.
This gives rise to
This gives rise to 


Question II. For every linear fractional transformation $\tau$ of finite order (it
suffices to consider rotations of prime order) does there exist an associated distinct
pair f(z), g(z) such that $g(z) = f(\tau(z))$?
A more intriguing question is 


Question III. Does there exist an associated distinct pair f(z), g(z) which is
not of the type mentioned above, i.e., for which there do not exist rational functions
F(z), G(z), q(z) and a fractional linear transformation $\beta$ such that f(z) = F(q(z)),
g(z) = G(q(z)) and $G(z) = F(\beta(z))$?


This work was supported in part by NSF Grant GP-28696. 


Reference 


1. W. Adams and E. Straus, Non-Archimedian analytic functions taking the same values at 
the same points, Ill. J. Math., 15 (1971) 418-424. 


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


A CONDITION UNDER WHICH A MAPPING IS A HOMEOMORPHISM 
W.R. Derrick, Arizona State University 


The purpose of this note is to present an elementary proof, suitable for a first 
course in topology, of a special case of a theorem of Whyburn [1, p. 116]. The proof 
provides a useful application of the Jordan Curve Theorem and can be used as a 
first step leading into the study of light open mappings. 

Consider sets in the Euclidean plane; let D denote the closed unit disk. For any 
function f defined on D designate points in the domain by the letter z and in the 
range by w. Let Bdry D and Int D be the boundary and interior of D respectively. 


THEOREM. Let $f : D \to D$ be a continuous function which maps Bdry D homeomorphically
onto Bdry D and is a local homeomorphism on Int D. Then f is a homeomorphism.


Proof. Since D is compact and the Euclidean plane is a Hausdorff space, we
need only prove that f is one-to-one and onto.

Suppose f is not onto. Then there is a projection $\Pi : f(D) \to$ Bdry D, and the mapping
$f^{-1} \Pi f$ is a retraction of D onto Bdry D, which is impossible.

The preimage of every arc in f(Int D) consists of a set of disjoint arcs in Int D,
since otherwise we would not have a local homeomorphism at any point of intersection.
Furthermore, since f(Int D) = Int D, the preimage of an arc meeting Bdry D
only at an endpoint $w^*$ consists of arcs meeting only at $f^{-1}(w^*)$.

Suppose $z_0, z_0'$ are distinct preimages of $w_0$ and $a$ is the radial arc joining $w_0$
to $w^*$, its nearest point on Bdry D. (For $w_0 = 0$ take any radial arc.) By the continuity
of f there exist arcs $a_0, a_0'$ joining $z_0, z_0'$ to $z^*$, the preimage of $w^*$, respectively,
such that $f(a_0) = f(a_0') = a$. Select $w_1, w_2$ on Bdry D and denote by $z_1, z_2$ their
preimages, let $b$ and $c$ be the straight line arcs joining $w_0$ to $w_1$ and $w_2$, respectively,
and designate by $b_1, b_1'$ the arcs joining $z_0, z_0'$ to $z_1$ which satisfy $f(b_1) = f(b_1') = b$,
and by $c_2, c_2'$ the arcs joining $z_0, z_0'$ to $z_2$ such that $f(c_2) = f(c_2') = c$. Let $d_1$ and $d_2$
be the arcs on Bdry D with endpoints $z_1, z^*$ and $z_2, z^*$ and satisfying $d_1 \cap d_2 = z^*$.
Then by the Jordan Curve Theorem $z_0'$ lies inside the simple closed curve $d_1 \cup b_1 \cup
c_2 \cup d_2$ because otherwise $a_0'$ would meet $b_1$ or $c_2$. If $z_0'$ lies inside the simple closed
curve $a_0 \cup d_1 \cup b_1$, then $c_2'$ meets $a_0$ or $b_1$, and if $z_0'$ lies inside $a_0 \cup c_2 \cup d_2$ then
$b_1'$ meets $a_0$ or $c_2$. In either case we have a contradiction, thus each point in f(D)
has a unique preimage.


Reference 


1. G. T. Whyburn, An open mapping approach to Hurwitz’s theorem, Trans. Amer. Math. 
Soc., 71 (1951) 113-119. 


MATHEMATICAL EDUCATION 


EDITED BY J. G. HARVEY AND M. W. POWNALL


Material for this Department should be sent to Shirley Hill, Department of Mathematics,
University of Missouri, Kansas City, MO 64110, or to Paul Mielke, Department of Mathematics,
Wabash College, Crawfordsville, IN 47933.


INDEPENDENT STUDY FOR UNDERGRADUATES 


W. C. RAMALEY, Colorado College 


In college catalogs there often occurs a listing for the Mathematics Department 
of “Independent Study’’. A great many diverse activities occur under this rubric. 
What follows is about some of these activities and their importance to the students, 
to the college, and perhaps, to the graduate school. The conclusions drawn come 
from my experience over the past 5 years at a college which devotes all its attention 
to a quality undergraduate education. 

At Carleton College “Independent Study’’ has had a long and dynamic history 
[4, 5]. In the 1970-71 academic year 23 students enrolled in the course. Their work 
could be classified as follows: 5 read a regularly offered course in a term the course 
was not offered, 4 read from special bibliographies prepared for a “‘reading course” 
(usually in history), 5 covered material that would be treated in an advanced course 
if Carleton could offer that course, 5 did directed research which may or may not 
have been related to a previous course, and 4 did truly independent research. Over 
the past 5 years the proportions and total numbers have remained fairly constant. 

Each type of activity makes different demands on the student and on the professor
who supervises the activity. For a regular course being covered in a term when it
is not taught, a faculty member who has taught the course may be able to spend 
only an hour or so a week with the student in addition to preparing and grading 
a final examination. A faculty member who has not taught the course should 
avoid this sort of independent study, unless he wants to learn the material himself. 
Even then, “‘learning-by-teaching’’ has obvious limitations that must be carefully 
considered, 


PROBLEMS AND SOLUTIONS 
EDITED BY Emory P. STARKE 


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN,
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: EARL M. L.
BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, WILLIAM R.
GEIGER, RAMESH GUPTA, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS
S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND WILLIAM L. SOULE, JR.


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 044738. To facilitate their consideration, solutions of elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
August 31, 1973. 


E 2414. Proposed by J. G. Wendel, University of Michigan 


In one form of chess match 2n games are played, wins count 1 point each, draws ½,
losses are worth 0. In order to win the match, the defender needs only score at least
n, while the challenger must achieve at least n + ½. Suppose that the two players
are of equal strength, and that the probability of a draw is a constant $\delta$. Prove or
disprove: the defender's chance of keeping his title is an increasing function of $\delta$.


E 2415. Proposed by C. D. H. Cooper, Macquarie University, Australia 


Find all positive integers n having the property that each positive divisor (> 1)
of n has the form $a^r + 1$ where a, r are integers and r > 1.


E 2416. Proposed by F. T. Howard, Wake Forest University 


Let $p_1, \cdots, p_k$ be distinct primes, $e_1, \cdots, e_k$ arbitrary non-negative integers, and r a
fixed positive integer. Prove that there are infinitely many positive integers n with
the property that $p_i^{e_i}$ $(i = 1, 2, \cdots, k)$ is the highest power of $p_i$ which divides the
binomial coefficient $\binom{n}{r}$.


E 2417. Proposed by Ioan Tomescu, University of Bucharest, Rumania 


The number of ways of filling a 2 × n rectangle with dominoes (i.e. with 1 × 2
rectangles) is well known (see problem E 1470 [1962, 61]). On page 139 of his book
Polyominoes, S. W. Golomb asks for the corresponding result for 3 × n rectangles.

Let $U(n)$ be the number of ways of covering a 3 × n rectangle with dominoes.
Obviously $U(n) = 0$ if n is odd. Show that

$$U(2m) = \frac{1}{2\sqrt{3}} \left[ (\sqrt{3} + 1)(2 + \sqrt{3})^m + (\sqrt{3} - 1)(2 - \sqrt{3})^m \right].$$
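
The stated formula can be compared with a direct count. The following sketch is an illustration only, and it assumes the closed form reads $\frac{1}{2\sqrt{3}}[(\sqrt{3}+1)(2+\sqrt{3})^m + (\sqrt{3}-1)(2-\sqrt{3})^m]$. The names `count_tilings` and `closed_form` are hypothetical. The count fills the rectangle column by column, recording in a bitmask which cells of the current column are already covered by a horizontal domino from the previous column.

```python
from functools import lru_cache
from math import sqrt


def count_tilings(rows, cols):
    """Count domino tilings of a rows x cols rectangle by a column-profile search."""

    @lru_cache(maxsize=None)
    def fill(col, mask):
        # mask: bit r is set if cell (r, col) is covered from column col - 1.
        if col == cols:
            return 1 if mask == 0 else 0
        return place(col, 0, mask, 0)

    def place(col, r, mask, nxt):
        if r == rows:
            return fill(col + 1, nxt)
        if mask & (1 << r):
            return place(col, r + 1, mask, nxt)
        # Horizontal domino sticking into the next column.
        total = place(col, r + 1, mask, nxt | (1 << r))
        # Vertical domino covering rows r and r + 1 of this column.
        if r + 1 < rows and not mask & (1 << (r + 1)):
            total += place(col, r + 2, mask, nxt)
        return total

    return fill(0, 0)


def closed_form(m):
    """Evaluate the proposed formula for a 3 x 2m rectangle (floating point)."""
    s = sqrt(3.0)
    return ((s + 1) * (2 + s) ** m + (s - 1) * (2 - s) ** m) / (2 * s)


for m in range(6):
    print(m, count_tilings(3, 2 * m), round(closed_form(m)))
```

The closed form is evaluated in floating point and rounded, which is adequate for small m but would need exact arithmetic for large m.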


E 2418. Proposed by C. A. Nicol, University of South Carolina 


Characterize those subsets S of the natural numbers with the property that 
every sum of elements taken from S (repetitions allowed) is composite. 


E 2419. Proposed by A. W. Walker, Toronto, Canada 


Points G, H, I, O are the centroid, orthocenter, incenter and circumcenter of a
scalene triangle $\Delta$; N and P are the midpoints of line segments OH and IH, F is the
contact point of the incircle and nine-point circle of $\Delta$, E is the reflection of F in the
right bisector of OH, and L is the inverse of E in the circle on GH as diameter. Prove:

(a) F and I are inverse in the circle with center N, radius NP;

(b) if $OF = \sqrt{3} \cdot OG$, points L and I coincide;

(c) lines FG, IL, OP concur.


SOLUTIONS OF ELEMENTARY PROBLEMS 
A Test for Primality 


E 2355 [1972, 518]. Proposed by Arthur Marshall, Madison, Wisconsin 


Given any odd integer n > 3, let k and j be the smallest natural numbers such 
that kn + 1 and jn are squares. Prove that n is prime if and only if both k and j are 
greater than n/4. 


Solution by R. J. Evans, Jackson State College, Mississippi. All variables
represent natural numbers. Suppose n is prime. Then $n \mid j$ so that $j = n > n/4$. Also
$kn = (a - 1)(a + 1)$ for some a, so that for some b, $a \pm 1 = nb$. Thus $kn = nb(nb \mp 2)
\ge n(n - 2)$, so $k \ge n - 2 > n/4$. Conversely, suppose n is composite. If n is a prime
power, $n = p^r$, then j = 1 or j = p according as r is even or odd. In either case, $j < n/4$,
the desired result. Now suppose n is not a prime power, so there exist relatively
prime odd integers p and q greater than 1 such that n = pq. By the Chinese Remainder
Theorem there exists an x such that $1 < x < n - 1$, $p \mid (x - 1)$ and $q \mid (x + 1)$. Let
$a = \min(x, n - x)$. Then $a^2 - 1 = k_1 n$ for some $k_1$. Thus $k_1 n < a^2 < (n/2)^2$, so
$k \le k_1 < n/4$.
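
The criterion is easy to test by brute force. The sketch below is an illustration with hypothetical function names: for each odd n > 3 in a small range it finds the smallest k and j by direct search and compares the outcome with trial division.

```python
from math import isqrt


def is_square(m):
    """Return True if the non-negative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m


def smallest_k(n):
    """Smallest natural number k with k*n + 1 a square (k = n - 2 always works)."""
    k = 1
    while not is_square(k * n + 1):
        k += 1
    return k


def smallest_j(n):
    """Smallest natural number j with j*n a square (j = n always works)."""
    j = 1
    while not is_square(j * n):
        j += 1
    return j


def is_prime(n):
    """Trial-division primality test, used only as an independent reference."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def passes_criterion(n):
    """True when both k and j exceed n/4, written without fractions."""
    return 4 * smallest_k(n) > n and 4 * smallest_j(n) > n


mismatches = [n for n in range(5, 500, 2) if passes_criterion(n) != is_prime(n)]
print(mismatches)
```

The searches terminate because k = n - 2 gives (n - 1)^2 and j = n gives n^2.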


Also solved by Problem Solving Group, Berne (Switzerland), John Christopher, A. P. Geist,
M. G. Greening (Australia), C. V. Heuer & G. A. Heuer, Wells Johnson, L. Kuipers, O. P. Lossers
(Netherlands), Carolyn MacDonald (partial solution), Helen M. Marston, L. E. Mattics, M. R.
Modak (India), Kenneth Schilling, Nan-Shan Shou, E. P. Starke, Allen Stenger, Charles Wexler,
and the proposer.


Editor’s Comment. Shou generalized the problem and its solution by showing 
that the conclusion is valid for 2 and for any integer of the form p or 2p, p an odd 
prime. Charles Wexler noted that since necessary and sufficient conditions for pri- 
mality other than the definition and Wilson’s theorem are very rare indeed, the 
problem is of more than passing interest. 


The Cancellation Law for Convex Sets 
E 2358 [1972, 519]. Proposed by W. H. Ruckle, Clemson University 


Suppose that A and B are closed, convex sets and that C is bounded. Show 
that if A+ C = B+C, then necessarily A = B. 


Editor's Comment. Both Richard Laatch and George Painter note that this
problem is Lemma 2, p. 167 of Hans Rådström, An embedding theorem for spaces
of convex sets, Proc. Amer. Math. Soc., 3 (1952) 165-169. It should be noted that the
elementary proof given in this reference holds without change in a (real or complex)
linear topological space. (See H. H. Schaefer, Topological Vector Spaces, Macmillan,
New York, 1964, 25-27.)

Many solvers observe that obviously C must be nonempty. 

P. J. Zwier and the proposer provide examples which show that the assumptions 
on A, B and C are necessary. If C were not bounded, we could take C to be the 
real line with A = {0} and B = {1}. If A were not convex, we could take A = {0,1} 
and B = C = [0,1]. If A were not closed, we could take A = C = (0,1) and 
B = [0,1]. 

Also solved by Sheldon Axler, Ronald Evans, C. V. Heuer & G. A. Heuer, E. M. Klein, O. P.
Lossers (Netherlands), Simeon Reich (Israel), Peter Renz, Ralph Seifert, Walter Stromquist, and
Jack Zelver.


Circular Regions Determined by Chords 


E 2359 [1972, 519]. Proposed by T. C. Brown, Simon Fraser University, 
Burnaby, Canada 


Place n distinct points on the circumference of a circle and draw all possible
chords through pairs of these points. Assume no three chords are concurrent and
let $a_n$ denote the resulting number of regions within the circle. Then the sequence
$a_1, a_2, \cdots$ begins 1, 2, 4, 8, 16, 31, .... What is $a_n$ in general?

Solution by Norman Bauman, Nanuet, N. Y. Consider the more general problem
of a region crossed by $l$ lines with $p$ interior points of intersection. One easily shows
by induction that the number of disjoint subregions created is $p + l + 1$. In the
special case of the problem, n points about a circle determine $\binom{n}{2}$ lines and $\binom{n}{4}$
internal intersections. The answer is, therefore, $a_n = \binom{n}{4} + \binom{n}{2} + 1$.
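
As a quick check of the formula against the terms quoted in the problem, here is a minimal sketch (the function name `regions` is illustrative). It evaluates $p + l + 1$ with $l = \binom{n}{2}$ chords and $p = \binom{n}{4}$ interior intersection points.

```python
from math import comb


def regions(n):
    """Regions cut out by all chords on n points, no three chords concurrent."""
    chords = comb(n, 2)         # l: one chord per pair of points
    crossings = comb(n, 4)      # p: one interior crossing per set of four points
    return crossings + chords + 1


print([regions(n) for n in range(1, 8)])
```

Each set of four points contributes exactly one interior crossing (the two diagonals of the quadrilateral they span), which is why $p = \binom{n}{4}$.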

Also solved by 54 other readers.


Editor’s Note. The problem is certainly not new. Readers point out that it has appeared in at least 
ten journals and books as well as the 1967 Santa Clara mathematics contest for high school students. 
The problem appears in Yaglom and Yaglom, Challenging Mathematical Problems, Holden-Day, 
1964, p. 108; in T. Murphy, The dissection of a circle by chords, The Mathematical Gazette, 74396 
(1972), pp. 113-115; and in Jay Graening, Induction, fallible but valuable, The Mathematics Teacher, 
Feb. 1971, pp. 127-131. 


Subrectangles of a Convex Body 


E 2360 [1972, 519]. Proposed by G. D. Chakerian, University of California, 
Davis 


A convex body in the plane is a convex set with non-empty interior. The width
of a convex body is the minimum possible distance between parallel supporting lines.
Show that if K is a convex body in the plane of width w and area A, then K contains
a rectangle with dimensions $\sqrt{A}/4$ by $w/2$.


I. Solution by G. A. Converse and J. E. Wetzel, University of Illinois. Let $l_1$
and $l_2$ be two parallel support lines to K at minimum distance w apart, and suppose
that the two perpendicular support lines $m_1$ and $m_2$ are h apart. Since $A \le hw \le h^2$,
evidently $h \ge \sqrt{A}$. There are contact points X and Y of K on $l_1$ and $l_2$ so that
$XY \perp l_1$ (by an argument similar to that given on p. 117 of Yaglom and Boltyanskii,
Convex Figures, Holt, Rinehart and Winston, 1961). Let P and Q be the intersections
of the diagonals of the two rectangles with edge XY (see the figure). Then the
quadrilateral XPYQ lies in K, $PQ \perp XY$, $XY = w$, and $PQ = h/2$. The rectangle
whose vertices are the midpoints of the sides of the quadrilateral XPYQ lies in K
and has sides w/2 and h/4; it is worth noting that it has area $T = wh/8 \ge A/8$.
Thus K surely contains the smaller rectangle with sides w/2 and $\sqrt{A}/4$; and, moreover,
the side of length w/2 can be chosen to lie perpendicular to the support line $l_1$.


II. Solution by the proposer. We establish a sharper result, namely, K contains
a rectangle with dimensions $\sqrt{A}/2$ by $w/2$.


DEFINITION. A diameter of K is any chord whose endpoints lie on two parallel
supporting lines.


LEMMA. There exists a quadrilateral PQRS inscribed in K such that PR and QS
are diameters of K, with PR orthogonal to QS, and such that supporting lines of K
through the vertices of the quadrilateral form a rectangle circumscribed about K
(the sides of the rectangle need not be parallel to the diameters).


Proof. We prove the lemma in case K is smooth and strictly convex. The general
result follows by standard approximation arguments. In this case, to each diameter
there corresponds a unique orthogonal diameter and a unique circumscribed parallelogram
whose sides contain the endpoints of the diameters. If for some choice of
direction for the initial diameter the corresponding parallelogram is not a rectangle,
it is evident that a "rotation" of this configuration through 90°, interchanging the
roles of the two diameters, will by continuity yield our circumscribed rectangle in
some intermediate position.

The midpoints of the sides of PQRS are the vertices of a rectangle J contained in
K. If $a \ge b$ are the lengths of the sides of the rectangle circumscribed about K (with
sides passing through P, Q, R, S), then the sides of J have length at least a/2 and b/2
respectively. Since $a \ge b \ge w$, we see that both sides of J have length at least w/2.
Also, $A \le ab \le a^2$, so $a/2 \ge \sqrt{A}/2$. Thus K contains a rectangle of the required
dimensions.


Partition Permutations 


E 2364 [1972, 663]. Proposed by G. J. Michaelides, University of South 
Florida 


Suppose that r is a positive integer and that $(i_1, i_2, \cdots, i_n)$ is a partition of r into
nonnegative integers. Show that if p is a prime factor of n which is relatively prime
to r, then the number of (distinct) permutations of $(i_1, i_2, \cdots, i_n)$ is divisible by p.


Solution by D. M. Bloom, Brooklyn College. Let the partition consist of k
distinct values $j_1, j_2, \cdots, j_k$ having respective multiplicities $m_1, m_2, \cdots, m_k$. (Thus
$m_1 j_1 + m_2 j_2 + \cdots + m_k j_k = r$ and $m_1 + m_2 + \cdots + m_k = n$.) Since $p \nmid r$, at least one
of the m's (say $m_1$) also is not divisible by p. Since $p \mid n$, it follows that $p \mid \binom{n}{m_1}$.
But the number N of distinct permutations is given by

$$N = \frac{n!}{m_1!\, m_2! \cdots m_k!} = \binom{n}{m_1} \frac{(n - m_1)!}{m_2! \cdots m_k!}.$$

Thus $p \mid N$.


Also solved by Irl Bivens, John Christopher, Ellen Hertz, Joseph Hoffman, F. T. Howard, Wells
Johnson, James King, Harry Lass, M. R. Modak (India), Paul Stockmeyer, and the proposer.


Editor’s Comment. Stockmeyer dispenses with the requirement that p be prime by arguing that 
each prime-power factor q* of p must divide N. Howard establishes a similar result. 
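
The divisibility statement of E 2364 can be checked exhaustively for small cases. The sketch below is illustrative only, with hypothetical function names: it enumerates every multiset of n nonnegative integers summing to r, counts its distinct arrangements as a multinomial coefficient, and records any prime p dividing n, coprime to r, that fails to divide the count.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, gcd


def permutation_count(parts):
    """Number of distinct arrangements of the tuple `parts` (a multinomial)."""
    count = factorial(len(parts))
    for multiplicity in Counter(parts).values():
        count //= factorial(multiplicity)
    return count


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def counterexamples(max_n, max_r):
    """List (n, r, parts, p) violating the claim; expected to be empty."""
    bad = []
    for n in range(2, max_n + 1):
        for r in range(1, max_r + 1):
            primes = [p for p in prime_factors(n) if gcd(p, r) == 1]
            for parts in combinations_with_replacement(range(r + 1), n):
                if sum(parts) != r:
                    continue
                count = permutation_count(parts)
                bad.extend((n, r, parts, p) for p in primes if count % p)
    return bad


print(counterexamples(6, 8))
```

Only primes are tested here; the prime-power refinement mentioned in the Editor's Comment is not covered by this sketch.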


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers, The State University,
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with
double spacing) on separate, signed sheets and should be mailed before August 31, 1973. Contributors
(in the United States) who desire acknowledgement of receipt of their solutions are asked
to enclose self-addressed, stamped postcards.


An asterisk (*) means neither the proposer nor the editors supplied a solution. 


5911. Proposed by Bill Knight, California Institute of Technology 
Let $F_n$ be the nth term of the sequence defined by
$$F_n = (n + 2) F_{n-1} - (n - 1) F_{n-2}, \qquad F_0 = a, \quad F_1 = b.$$
Find an explicit formula for $F_n$.
5912. Proposed by R. B. Kirk, Southern Illinois University 


Let X be a compact Hausdorff space, and let C denote the space of continuous 
functions on X. Assume that C can be written as a countable union of equicontinuous 
sets. Prove that X is finite. 


5913. Proposed by D. E. Daykin and J. K. Dugdale, University of Reading, 
England 


Let H be a real or complex Hilbert space and let x, y, z be points in H. We call
$\langle x, y, z \rangle$ a triangle. As usual

$$L(y, z) = \{y + \alpha(y - z) : \alpha \text{ a scalar}\}$$

is the line through y and z, and the distance of x from L(y, z) is

$$\rho(x, L(y, z)) = \inf \{\|x - w\| : w \in L(y, z)\}.$$

For convenience put $a = \|x - y\|$, $b = \|y - z\|$, $c = \|z - x\|$ and $s = \frac{1}{2}(a + b + c)$.
Euclidean geometry suggests two definitions

$$\Delta_1 = \tfrac{1}{2}\, \rho(x, L(y, z))\, \|y - z\|$$

and

$$\Delta_2 = \sqrt{s(s - a)(s - b)(s - c)}$$

for the area of triangle $\langle x, y, z \rangle$. Compare the values of $\Delta_1$ and $\Delta_2$ and determine
when they are equal.


5914. Proposed by D. E. Daykin, C. E. Linderholm and Albert Wilansky,
University of Reading, England.


Show that if $A = \{z_1, z_2, \cdots, z_n\}$ is a finite set of n complex numbers, there is a
subset B of A such that

$$\left| \sum_{z \in B} z \right| \ge \pi^{-1} \sum_{1 \le i \le n} |z_i|.$$
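
A numerical experiment can make the inequality of 5914 concrete. The sketch below assumes the constant in the garbled display is $\pi^{-1}$, which is a reading and not confirmed by the source, and it is only a sanity check on random data, not a proof. The function name `best_subset_modulus` is hypothetical; it searches all non-empty subsets, so it is practical only for small n.

```python
import math
import random
from itertools import combinations


def best_subset_modulus(zs):
    """Largest |sum of B| over all non-empty subsets B of the list zs."""
    best = 0.0
    for size in range(1, len(zs) + 1):
        for subset in combinations(zs, size):
            best = max(best, abs(sum(subset)))
    return best


random.seed(0)
worst_ratio = math.inf
for _ in range(50):
    n = random.randint(1, 8)
    zs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    total = sum(abs(z) for z in zs)
    worst_ratio = min(worst_ratio, best_subset_modulus(zs) / total)

# Under the assumed reading, the ratio should never fall below 1/pi.
print(worst_ratio, 1 / math.pi)
```

Random samples rarely come close to the extremal configuration, so the observed ratio is usually well above $1/\pi$.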


5915*. Proposed by D. M. Battany, Oceanside, California 
Let p, be the nth prime. Show that 


n, (242 
P<Pt-Pn p Pr 


for all prime p, or isolate the exceptional values. 


SOLUTIONS OF ADVANCED PROBLEMS 
Discontinuity in a Function with All Partial Derivatives 


5840 [1972, 187]. Proposed by Maury Horowitz, Nick Metas and Gerald 
Leibowitz, University of Connecticut 


Can one construct a real-valued function f whose domain is an open set U in $\mathbf{R}^2$
such that f has all partial derivatives of all orders at every point in U yet there is some
point in U at which f is not continuous?


Solution by Wolfe Snow, Brooklyn College. Let 


exp (x"*y7*) 
f(x,y) = < exp(x~4) + exp(y~*) 


0 for xy = 0. 


for xy £0, 


Then f is not continuous at (0,0) since lim, .y.9 f(x, y) = 4. 

The only possible source of differentiation difficulty is when x or y is 0. This 
does not pose any problem, however, since for all derivatives the exponential in the 
denominator dominates the exponential in the numerator and any power terms that 
may arise, and consequently all derivatives are 0 when either x or y is 0. 

Thus, f has all partial derivatives of all orders at every point of $\mathbf{R}^2$, yet f is not
continuous at (0, 0).
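
The behavior of this function can be checked numerically. The sketch below is only an illustration, not part of the published solution: it evaluates f in a rescaled form (an implementation choice made here to avoid floating-point overflow) and shows the value 1/2 along the diagonal and values that vanish near the axes. The name `f` and the sample points are assumptions chosen for the example.

```python
import math

def f(x, y):
    """Snow's function: exp(x^-2 y^-2) / (exp(x^-4) + exp(y^-4)), and 0 when xy = 0."""
    if x == 0 or y == 0:
        return 0.0
    u, v = x ** -2, y ** -2
    # Divide numerator and denominator by exp(u*v):
    #   f = 1 / (exp(u^2 - u*v) + exp(v^2 - u*v)).
    a = u * u - u * v
    b = v * v - u * v
    # a + b = (u - v)^2 >= 0, so m = max(a, b) >= 0; factor out exp(m) to avoid overflow.
    m = max(a, b)
    return math.exp(-m) / (math.exp(a - m) + math.exp(b - m))

if __name__ == "__main__":
    for t in (0.5, 0.1, 0.01):
        print("diagonal", t, f(t, t))      # stays at 1/2 as t -> 0
    for t in (0.5, 0.1, 0.01):
        print("near axis", t, f(t, 1.0))   # tends to 0 as t -> 0
```

On the diagonal the two exponents in the rescaled denominator are both zero, which is why the value is exactly 1/2 there, while f(0, 0) = 0.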


Also solved by R. T. Baumel, J. M. Bell, R. L. Bishop, A. A. Blank, R. A. Christiansen, L. E. 
Clarke (England), S. Cullinane, J. Diederich, A. G. Dors, G. J. Foschini, G. Freilich, R. Katz, I.
Korec (Yugoslavia), H.C. Kranzer, O. P. Lossers (Netherlands), M. Machover, L. Mattics, J. G. 
Mauldon, P. L. Montgomery, C. J. Neugebauer, S. Rajnak, J. Ratz, (Switzerland), H. Van Evelghem 
(Belgium), A. Weinmann (England), and A. C. Williams. 


Notes. Bishop notes an example on p. 20 of Bishop and Goldberg, Tensor Analysis on Manifolds. 
Dors cites a method of construction which is found in Appendix L of E. E. Moise, Calculus. Blank 
and V. Mizel note an example of a function with a discontinuity, yet having directional derivatives 
of all orders.




Fourier-Stieltjes Transform of a Continuous Measure 


5841 [1972, 187]. Proposed by L.-S. Hahn, University of New Mexico 


Is there a (complex) continuous measure $\mu$ (i.e., $\mu(E) = 0$ if E is countable) on
the real line, whose Fourier-Stieltjes transform has modulus 1 everywhere on the
real line?


Solution by G. M. Leibowitz, University of Connecticut. There is no such measure. 
We offer two proofs, one quantitative, the other qualitative. 
I. By a theorem of Wiener, if $\mu \in M(\mathbf{R})$, then

\[ \sum_x |\mu(\{x\})|^2 = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\hat{\mu}(t)|^2 \, dt. \]

(See Katznelson, An Introduction to Harmonic Analysis, p. 138.) So if $\hat{\mu}$ has modulus
1, $\sum_x |\mu(\{x\})|^2 = 1$, and $\mu$ is not continuous. (An analogous averaging process holds
on any locally compact abelian group, yielding the same result.)

II. Let $\mu \in M(G)$, G any LCA group. Set $\tilde{\mu}(E) = \overline{\mu(-E)}$. Assume that $|\hat{\mu}| = 1$.
Then $(\mu * \tilde{\mu})^{\wedge} = |\hat{\mu}|^2 = 1 = \hat{\delta}$, where $\delta$ is the point mass at 0. Hence $\mu * \tilde{\mu} = \delta$ by
uniqueness of Fourier-Stieltjes transforms. The continuous measures form an ideal,
so $\mu$ is not continuous.

We could also quote Corollary 5.6.9(b) in Rudin, Fourier Analysis on Groups. 


Also solved by R. W. Chaney, S. H. Friedberg, C. C. Graham, D. Lind, N. X. Uy, and the 
proposer. 


Orthogonal Projections 


5842 [1972, 307]. Proposed by B. B. Winter, Eugene, Oregon 


Let T be a linear (not necessarily continuous) map of a Hilbert space H to itself.
Suppose there exists a subset S such that $Tx \in S$ and $x - Tx \in S^{\perp}$ for all $x \in H$.
Show that S is a closed linear subspace and that T is the (necessarily continuous)
orthogonal projection of H onto S.


Solution by S. P. Gudder, University of Denver. If $x \in S$, since $Tx \in S$ and
$x - Tx \in S^{\perp}$, we have $(x - Tx) \perp (x - Tx)$, so $Tx = x$. We thus see that $x \in S$
if and only if $Tx = x$. Hence $T^2 y = T(Ty) = Ty$ for all $y \in H$ and T is a projection.
Now T is hermitian since

\[ (x, Ty) = ((x - Tx) + Tx, Ty) = (Tx, Ty) = (Tx, (Ty - y) + y) = (Tx, y) \]

for all $x, y \in H$. Since a hermitian operator defined on all the space is self-adjoint,
T is a self-adjoint projection and hence an orthogonal projection. Since S is the
range of T, S is a closed subspace.


Also solved by forty-two other contributors. 




A First Order Non-linear Differential Inequality 


5843 [1972, 307]. Proposed by N. P. Callas, Office of Scientific Research, 
U.S. Air Force 


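Show that if $\varphi(x) \ge 0$ satisfies the nonlinear differential inequality

\[ \varphi'(x) + b(x)\varphi(x) \le f(x)[\varphi(x)]^{\alpha}, \]

where $\varphi(a) = c$ and $0 \le \alpha < 1$, then

\[ \varphi(x) \le \exp\Bigl(-\int_a^x b(t)\,dt\Bigr) \Bigl[\int_a^x (1-\alpha) f(t) \exp\Bigl(\int_a^t (1-\alpha) b(\tau)\,d\tau\Bigr) dt + c^{1-\alpha}\Bigr]^{1/(1-\alpha)}. \]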


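Solution by D. G. Belanger, University of South Alabama. Assume that
$\varphi(x) > 0$ on [a, x]. Divide the inequality by $[\varphi(x)]^{\alpha}$ and make the substitution
$v(x) = (\varphi(x))^{1-\alpha}$, obtaining

\[ \frac{1}{1-\alpha}\, v'(x) + b(x) v(x) \le f(x). \]

We now multiply by the integrating factor $(1-\alpha)\exp\bigl(\int_a^x (1-\alpha) b(t)\,dt\bigr)$, obtaining

\[ \frac{d}{dx}\Bigl[v(x) \exp\Bigl(\int_a^x (1-\alpha) b(t)\,dt\Bigr)\Bigr] \le (1-\alpha) f(x) \exp\Bigl(\int_a^x (1-\alpha) b(t)\,dt\Bigr). \]

Integrating,

\[ v(x) \exp\Bigl(\int_a^x (1-\alpha) b(t)\,dt\Bigr) \le \int_a^x \Bigl[(1-\alpha) f(t) \exp\Bigl(\int_a^t (1-\alpha) b(\tau)\,d\tau\Bigr)\Bigr] dt + v(a), \]

\[ v(x) \le \exp\Bigl(-\int_a^x (1-\alpha) b(t)\,dt\Bigr) \Bigl\{\int_a^x (1-\alpha) f(t) \exp\Bigl(\int_a^t (1-\alpha) b(\tau)\,d\tau\Bigr) dt + v(a)\Bigr\}. \]

Since $v(x) = [\varphi(x)]^{1-\alpha}$,

\[ [\varphi(x)]^{1-\alpha} \le \exp\Bigl(-\int_a^x (1-\alpha) b(t)\,dt\Bigr) \Bigl[\int_a^x (1-\alpha) f(t) \exp\Bigl(\int_a^t (1-\alpha) b(\tau)\,d\tau\Bigr) dt + c^{1-\alpha}\Bigr], \]

\[ \varphi(x) \le \exp\Bigl(-\int_a^x b(t)\,dt\Bigr) \Bigl[\int_a^x (1-\alpha) f(t) \exp\Bigl(\int_a^t (1-\alpha) b(\tau)\,d\tau\Bigr) dt + c^{1-\alpha}\Bigr]^{1/(1-\alpha)}. \]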
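
As a sanity check on the bound, one can test a simple special case numerically. The sketch below assumes b = f = 1, exponent 1/2, a = 0 and c = 4; these values, the slack term, and the names `solve` and `bound` are illustrative choices, not taken from the problem. For this case the right-hand side of the bound evaluates in closed form to (1 + exp(-x/2))^2. The code integrates the differential equation with a classical Runge-Kutta step, once with equality and once with a nonnegative slack subtracted, and compares against the bound.

```python
import math

ALPHA = 0.5   # exponent in the inequality (assumed for this example)
C = 4.0       # initial value at a = 0 (assumed)

def rhs(phi, slack):
    # phi' = -b*phi + f*phi**ALPHA - slack with b = f = 1.
    # slack = 0 gives equality; slack > 0 gives a strict differential inequality.
    return -phi + phi ** ALPHA - slack

def solve(slack, x_end=2.0, steps=2000):
    """Classical fourth-order Runge-Kutta integration on [0, x_end]."""
    h = x_end / steps
    phi = C
    out = [(0.0, phi)]
    for k in range(steps):
        k1 = rhs(phi, slack)
        k2 = rhs(phi + 0.5 * h * k1, slack)
        k3 = rhs(phi + 0.5 * h * k2, slack)
        k4 = rhs(phi + h * k3, slack)
        phi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        out.append(((k + 1) * h, phi))
    return out

def bound(x):
    # exp(-x) * [ integral_0^x (1/2) exp(t/2) dt + C**(1/2) ]**2 = (1 + exp(-x/2))**2
    return (1.0 + math.exp(-x / 2)) ** 2

if __name__ == "__main__":
    for x, phi in solve(slack=0.1)[::500]:
        print(f"x={x:.2f}  phi={phi:.6f}  bound={bound(x):.6f}")
```

With zero slack the computed solution should track the bound closely, which reflects the fact that the estimate is attained when the inequality is an equality.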


Also solved by J. E. Chance, F. A. Homann, S.J., A. A. Jagers (Netherlands), G. A. Kemper, 
Charlotte Krauthammer (Austria), J. R. Kuttler, Beatriz Margolis (Argentina), R. J. Schaar, J. S. 
Shipman, T. Teichmann, H.C. Wente, and the proposer. 

The original statement contained a misprint, as discovered by all contributors: as first printed,
b(t) had an incorrect coefficient $(1 - \alpha)$.


Discontinuities of Functions in $\mathbf{R}^2$


5844 [1972, 307]. Proposed by L.-S. Hahn, University of New Mexico 


Construct a function defined everywhere in the plane which is nowhere continu- 
ous and yet is continuous in each variable separately, or prove such a function 
does not exist. 


Solution by G. M. Leibowitz, University of Connecticut. In volume one of
E. W. Hobson, The Theory of Functions of a Real Variable, reprinted by Dover,
1957, we see on p. 449 that if f is separately continuous in each variable, then f is
continuous at points on each graph of a continuous function. Hence no such function
exists.


Also solved by Bruce Ferrero, O. P. Lossers (Netherlands), C. J. Neugebauer, T. Salat (Czecho- 
slovakia), and the proposer. 

Note. We are referred by Salat to F. W. Carroll, Separately continuous functions, this MONTHLY,
V. 78 (1971), p. 175; and by Lossers to C. Goffman, Real Functions. The critical fact is that f is in
the first Baire class.


REVIEWS 


EDITED BY J. ARTHUR SEEBACH, JR. AND LYNN A. STEEN 
with the assistance of the mathematics departments of St. Olaf and Carleton Colleges 


COLLABORATING EDITOR FOR FILMS: SEYMOUR SCHUSTER, CARLETON COLLEGE 


Printed materials for review should be sent to: Book Review Editor, American Mathematical 
Monthly, St. Olaf College, Northfield, MN 55057. Films and correspondence relating to films 
should be sent to Seymour Schuster, Carleton College, Northfield MN 55057. 

All unsigned material is written by the editors. A boldface capital C in the margin indicates 
that a review is based in part on classroom use. Professors willing to write such a review should 
inform the editor in order to avoid duplication. 


Perspectives in Mathematics. By David E. Penney. Benjamin, Menlo Park, California, 
1972. xiv + 349 pp. $9.95. (Telegraphic Review, April 1972.) 
Mathematics in Civilization. By H. L. Resnikoff and R. O. Wells, Jr. Holt, Rinehart, 


THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 BY BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 NUMBER 6 
CODEN: AMMYAE 
CONTENTS 
A History of the Prime Number Theorem . . . L. J. GOLDSTEIN 599
Differentiation under the Integral Sign . . . HARLEY FLANDERS 615
The Stanford University Competitive Examination in Mathematics . . . G. POLYA AND J. KILPATRICK 627
How to Classify Differential Polynomials . . . REUBEN HERSH 641
The Cesaro Operators and their Generalizations: Examples in Infinite-Dimensional Linear Analysis . . . GERALD LEIBOWITZ 654
A. A. Albert . . . D. ZELINSKY 661
MATHEMATICAL NOTES
Functions Satisfying a Mean Value Property at their Zeros . . . D. P. STANFORD 665
On an Extension of the Theorem of Hausdorff-Young . . . LIANG-SHIN HAHN 667
A Characterization of the n x n Matrices over a Finite Field . . . J. V. BRAWLEY AND L. CARLITZ 670
Another Proof of Bernstein’s Theorem . . . P. J. O’HARA 673
Addendum to “On the Diffeomorphisms of Euclidean Space” . . . W. B. GORDON 674
RESEARCH PROBLEMS
How Unexpected is the Prime Number Theorem? . . . M. D. HIRSCHHORN 675
CLASSROOM NOTES
The Indecomposability of the Dyadic Solenoid . . . S. B. NADLER, JR. 677
The Differentiability Properties of Typical Functions in C[a, b] . . . A. M. BRUCKNER 679
Representing a Finite Borel Measure in Terms of its Distribution Function . . . J. J. HIGGINS 683
(Continued on inside cover)
JUNE-JULY 1973


MATHEMATICAL EDUCATION
Using Student-tutors in Precalculus Instruction . . . T. A. EISENBERG AND J. B. BROWNE 685
Economics as a Minor for Undergraduate Mathematics Majors . . . D. F. ELLIS 688
Survival for Mathematics Students! . . . B. B. HUGHES 689
ELEMENTARY PROBLEMS AND SOLUTIONS
ADVANCED PROBLEMS AND SOLUTIONS
REVIEWS . . . 702
NEWS AND NOTICES
MATHEMATICAL ASSOCIATION OF AMERICA
February Meeting of the Northern California Section . . . 722
November Meeting of the Maryland-District of Columbia-Virginia Section . . . 722
November Meeting of the Philadelphia Section . . . 723
Calendars of Future Meetings


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p.2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 
protection against loss. 

Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes 
11 months, Math. Education 10 months. 


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of Mathe- 
matics, Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor; 
ADVERTISING CORRESPONDENCE: to RAouL HAILPERN, Mathematical Association of America, 
SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B. 
WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036. 


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect 
ASSOCIATE EDITORS 


JOSHUA BARLAZ J. G. HARVEY SEYMOUR SCHUSTER 
E.R. BERLEKAMP ERIC S. LANGFORD J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA P, D. LAX FE. P. STARKE 

ROBERT GILMER ARTHUR MATTUCK LYNN A. STEEN 
RICHARD GUY M. W. POWNALL JAMES WENDEL 

RAOUL HAILPERN GIAN-CARLO ROTA 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


A HISTORY OF THE PRIME NUMBER THEOREM 
L. J. GOLDSTEIN, University of Maryland 


The sequence of prime numbers, which begins 
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, ...,


has held untold fascination for mathematicians, both professionals and amateurs 
alike. The basic theorem which we shall discuss in this lecture is known as the prime 
number theorem and allows one to predict, at least in gross terms, the way in which 
the primes are distributed. Let x be a positive real number, and let $\pi(x)$ = the number
of primes $\le x$. Then the prime number theorem asserts that

\[ \lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1, \tag{1} \]

where log x denotes the natural log of x. In other words, the prime number theorem
asserts that

\[ \pi(x) = \frac{x}{\log x} + o\Bigl(\frac{x}{\log x}\Bigr) \quad (x \to \infty), \tag{2} \]

where $o(x/\log x)$ stands for a function f(x) with the property

\[ \lim_{x \to \infty} \frac{f(x)}{x/\log x} = 0. \]

Actually, for reasons which will become clear later, it is much better to replace (1)
and (2) by the following equivalent assertion:

\[ \pi(x) = \int_2^x \frac{dy}{\log y} + o\Bigl(\frac{x}{\log x}\Bigr). \tag{3} \]

To prove that (2) and (3) are equivalent, it suffices to integrate

\[ \int_2^x \frac{dy}{\log y} \]

once by parts to get

\[ \int_2^x \frac{dy}{\log y} = \frac{x}{\log x} - \frac{2}{\log 2} + \int_2^x \frac{dy}{\log^2 y}. \tag{4} \]


Larry Goldstein received his Princeton Ph.D. under G. Shimura. After a Gibbs instructorship 
at Yale, he joined the Univ. of Maryland as Associate Professor and now is Professor. His research 
is in Analytic and Algebraic Number Theory and Automorphic Functions. He is the author of Analy- 
tic Number Theory (Prentice-Hall 1971), and Abstract Algebra, A First Course (Prentice-Hall, to 
appear). Editor. 




However, for $x \ge 4$,

\[ \int_2^x \frac{dy}{\log^2 y} = \int_2^{\sqrt{x}} \frac{dy}{\log^2 y} + \int_{\sqrt{x}}^x \frac{dy}{\log^2 y} \le \frac{\sqrt{x}}{\log^2 2} + \frac{x}{\log^2 \sqrt{x}}, \tag{5} \]

where we have used the fact that $1/\log^2 x$ is monotone decreasing for x > 1. It is
clear that (4) and (5) show that (2) and (3) are equivalent to one another. The advantage
of the version (3) is that the function

\[ \mathrm{Li}(x) = \int_2^x \frac{dy}{\log y}, \]

called the logarithmic integral, provides a much closer numerical approximation to
$\pi(x)$ than does $x/\log x$. This is a rather deep fact and we shall return to it.
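
The comparison between the two approximations is easy to try on a computer. The following is a minimal sketch, not taken from the lecture: it counts primes with a sieve of Eratosthenes and approximates the logarithmic integral from 2 by Simpson's rule after the substitution y = exp(u). The function names, the choice x = 1,000,000 and the number of Simpson panels are assumptions made for the illustration.

```python
import math

def prime_flags(n):
    """Sieve of Eratosthenes: flags[k] == 1 exactly when k is prime, for 0 <= k <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return flags

def li_from_2(x, panels=20000):
    """Simpson approximation of the integral of dy/log y over [2, x].

    With y = exp(u) the integral becomes that of exp(u)/u over [log 2, log x].
    `panels` must be even.
    """
    a, b = math.log(2.0), math.log(x)
    h = (b - a) / panels
    g = lambda u: math.exp(u) / u
    total = g(a) + g(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

if __name__ == "__main__":
    x = 10 ** 6
    pi_x = sum(prime_flags(x))
    print("pi(x)     =", pi_x)
    print("x / log x =", x / math.log(x))
    print("Li(x)     =", li_from_2(x))
```

At this value of x the logarithmic integral misses the prime count by far less than x/log x does, in line with the remark above.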

In this lecture, I should like to explore the history of the ideas which led up to the 
prime number theorem and to its proof, which was not supplied until some 100 years 
after the first conjecture was made. The history of the prime number theorem provides 
a beautiful example of the way in which great ideas develop and interrelate, feeding 
upon one another ultimately to yield a coherent theory which rather completely 
explains observed phenomena. 

The very conception of a prime number goes back to antiquity, although it is not 
possible to say precisely when the concept first was clearly formulated. However, a 
number of elementary facts concerning the primes were known to the Greeks. Let us 
cite three examples, all of which appear in Euclid: 

(i) (Fundamental Theorem of Arithmetic): Every positive integer n can be
written as a product of primes. Moreover, this expression of n is unique up to a
rearrangement of the factors.

(ii) There exist infinitely many primes.

(iii) The primes may be effectively listed using the so-called “sieve of
Eratosthenes”.

We will not comment on (i), (iii) any further, since they are part of the curriculum 
of most undergraduate courses in number theory, and hence are probably familiar 
to most of you. However, there is a proof of (ii) which is quite different from Euclid’s 
well-known proof and which is very significant to the history of the prime number 
theorem. This proof is due to the Swiss mathematician Leonhard Euler and dates 
from the middle of the 18th century. It runs as follows: 

Assume that $p_1, \dots, p_N$ is a complete list of all primes, and consider the product

\[ \prod_{i=1}^{N} \Bigl(1 - \frac{1}{p_i}\Bigr)^{-1} = \prod_{i=1}^{N} \Bigl(1 + \frac{1}{p_i} + \frac{1}{p_i^2} + \cdots\Bigr). \tag{6} \]

Since every positive integer n can be written uniquely as a product of prime powers,
every unit fraction 1/n appears in the formal expansion of the product (6). For
example, if $n = p_1^{a_1} \cdots p_N^{a_N}$, then 1/n occurs from multiplying the terms

\[ 1/p_1^{a_1},\ 1/p_2^{a_2},\ \dots,\ 1/p_N^{a_N}. \]

Therefore, if R is any positive integer,

\[ \prod_{i=1}^{N} \Bigl(1 - \frac{1}{p_i}\Bigr)^{-1} \ge \sum_{n=1}^{R} \frac{1}{n}. \tag{7} \]

However, as $R \to \infty$, the sum on the right hand side of (7) tends to infinity, which
contradicts (7). Thus, $p_1, \dots, p_N$ cannot be a complete list of all primes. We should
make two comments about Euler’s proof: First, it links the Fundamental Theorem 
of Arithmetic with the infinitude of primes. Second, it uses an analytic fact, namely 
the divergence of the harmonic series, to conclude an arithmetic result. It is this 
latter feature which became the cornerstone upon which much of 19th century 
number theory was erected. 
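
The inequality behind Euler's argument has an unconditional version that can be checked directly: every integer n not exceeding R is a product of primes not exceeding R, so the product of (1 - 1/p)^(-1) over primes p up to R is at least the harmonic sum up to R. The sketch below illustrates this; the function names and the sample values of R are assumptions made for the example.

```python
import math

def primes_upto(n):
    """List of primes <= n via the sieve of Eratosthenes."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [k for k in range(n + 1) if flags[k]]

def euler_product(R):
    """Product of (1 - 1/p)^(-1) over primes p <= R."""
    prod = 1.0
    for p in primes_upto(R):
        prod *= 1.0 / (1.0 - 1.0 / p)
    return prod

def harmonic(R):
    """Partial sum 1 + 1/2 + ... + 1/R of the harmonic series."""
    return sum(1.0 / n for n in range(1, R + 1))

if __name__ == "__main__":
    for R in (10, 100, 1000, 10000):
        print(R, euler_product(R), harmonic(R))
```

Since the harmonic sums grow without bound, the products must too, which is impossible if only finitely many primes contribute factors.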

The first published statement which came close to the prime number theorem
was due to Legendre in 1798 [8]. He asserted that $\pi(x)$ is of the form $x/(A \log x + B)$
for constants A and B. On the basis of numerical work, Legendre refined his conjecture
in 1808 [9] by asserting that

\[ \pi(x) = \frac{x}{\log x + A(x)}, \]

where A(x) is “approximately 1.08366...”. Presumably, by this latter statement,
Legendre meant that

\[ \lim_{x \to \infty} A(x) = 1.08366. \]

It is precisely in regard to A(x), where Legendre was in error, as we shall see below.
In his memoir [9] of 1808, Legendre formulated another famous conjecture. Let k
and l be integers which are relatively prime to one another. Then Legendre asserted
that there exist infinitely many primes of the form $l + kn$ $(n = 0, 1, 2, 3, \dots)$. In other
words, if $\pi_{k,l}(x)$ denotes the number of primes p of the form $l + kn$ for which $p \le x$,
then Legendre conjectured that

\[ \pi_{k,l}(x) \to \infty \quad \text{as } x \to \infty. \tag{8} \]


Actually, the proof of (8) by Dirichlet in 1837 [2] provided several crucial ideas on 
how to approach the prime number theorem. 




Although Legendre was the first person to publish a conjectural form of the 
prime number theorem, Gauss had already done extensive work on the theory of 
primes in 1792-3. Evidently Gauss considered the tabulation of primes as some sort 
of pastime and amused himself by compiling extensive tables on how the primes 
distribute themselves in various intervals of length 1000. We have included some of
Gauss’ tabulations as an Appendix. The first table, excerpted from [3, p. 436], covers
the primes from 1 to 50,000. Each entry in the table represents an interval of length
1000. Thus, for example, there are 168 primes from 1 to 1000; 135 from 1001 to 2000;
127 from 2001 to 3000; and so forth. Gauss suspected that the density with which
primes occurred in the neighborhood of the integer n was $1/\log n$, so that the number
of primes in the interval [a, b) should be approximately equal to

\[ \int_a^b \frac{dx}{\log x}. \]


In the second set of tables, samples from [4, pp. 442-3], Gauss investigates the 
distribution of primes up to 3,000,000 and compares the number of primes found 
with the above integral. The agreement is striking. For example, between 2,600,000 
and 2,700,000, Gauss found 6762 primes, whereas 


\[ \int_{2,600,000}^{2,700,000} \frac{dx}{\log x} = 6761.332. \]
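
Gauss' experiment is easy to repeat in miniature. The sketch below is an illustration only, with all names chosen here: it counts primes in blocks of 1000 with a sieve and approximates the integral of dx/log x over each block by Simpson's rule. The same two functions can be applied to the interval from 2,600,000 to 2,700,000 after sieving up to 2,700,000.

```python
import math

def prime_flags(n):
    """Sieve of Eratosthenes: flags[k] == 1 exactly when k is prime, for 0 <= k <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return flags

def count_primes(flags, lo, hi):
    """Number of primes p with lo <= p <= hi."""
    return sum(flags[lo : hi + 1])

def integral_dx_over_log(a, b, panels=1000):
    """Composite Simpson approximation of the integral of dx/log x over [a, b].

    Requires a > 1 and an even number of panels.
    """
    h = (b - a) / panels
    total = 1 / math.log(a) + 1 / math.log(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) / math.log(a + k * h)
    return total * h / 3

if __name__ == "__main__":
    flags = prime_flags(4000)
    for lo in range(1, 4000, 1000):
        hi = lo + 999
        # The integrand is singular at x = 1, so the first block starts at 2.
        estimate = integral_dx_over_log(max(lo, 2), hi)
        print(lo, hi, count_primes(flags, lo, hi), round(estimate, 1))
    print(integral_dx_over_log(2_600_000, 2_700_000))
```

The agreement in the first few blocks is rough, as one would expect for small numbers; it is the long-range agreement that Gauss found striking.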


Gauss never published his investigations on the distribution of primes. Never- 
theless, there is little reason to doubt Gauss’ claim that he first undertook his work 
in 1792-93, well before the memoir of Legendre was written. Indeed, there are 
several other known examples of results of the first rank which Gauss proved, but 
never communicated to anyone until years after the original work had been done. 
This was the case, for example, with the elliptic functions, where Gauss preceded 
Jacobi, and with Riemannian geometry, where Gauss anticipated Riemann. The only 
information beyond Gauss’ tables concerning Gauss’ work in the distribution of 
primes is contained in an 1849 letter to the astronomer Encke. We have included a 
translation of Gauss’ letter. 

In his letter Gauss describes his numerical experiments and his conjecture concerning
$\pi(x)$. There are a number of remarkable features of Gauss’ letter. On the
second page of the letter, Gauss compares his approximation to $\pi(x)$, namely Li(x),
with Legendre’s formula. The results are tabulated at the top of the second page
and Gauss’ formula yields a much larger numerical error. In a very prescient statement,
Gauss defends his formula by noting that although Legendre’s formula yields a 
smaller error, the rate of increase of Legendre’s error term is much greater than his 
own. We shall see below that Gauss anticipated what is known as the ‘‘Riemann 
hypothesis.’’ Another feature of Gauss’ letter is that he casts doubt on Legendre’s 
assertion about A(x). He asserts that the numerical evidence does not support any 
conjecture about the limiting value of A(x). 




Gauss’ calculations are awesome to contemplate, since they were done long 
before the days of high-speed computers. Gauss’ persistence is most impressive. 
However, Gauss’ tables are not error-free. My student, Edward Korn, has checked 
Gauss’ tables using an electronic computer and has found a number of errors. We 
include the corrected entries in an appendix. In spite of these (remarkably few) 
errors, Gauss’ calculations still provide overwhelming evidence in favor of the prime 
number theorem. Modern students of mathematics should take note of the great 
care with which data was compiled by such giants as Gauss. Conjectures in those 
days were rarely idle guesses. They were usually supported by piles of laboriously 
gathered evidence. 

The next step toward a proof of the prime number theorem was a step in a 
completely different direction, and was taken by Dirichlet in 1837 [2]. In a beautiful 
memoir, Dirichlet proved Legendre’s conjecture (8) concerning the infinitude of 
primes in an arithmetic progression. Dirichlet’s work contained two radically new 
ideas, which we should discuss in some detail. 

Let $\mathbf{Z}_n$ denote the ring of residue classes modulo n, and let $\mathbf{Z}_n^{\times}$ denote the group
of units of $\mathbf{Z}_n$. Then $\mathbf{Z}_n^{\times}$ is the so-called “group of reduced residue classes modulo n”
and consists of those residue classes containing an element relatively prime to n. If k
is an integer, let us denote by $\bar{k}$ its residue class modulo n. Dirichlet’s first brilliant
idea was to introduce the characters of the group $\mathbf{Z}_n^{\times}$; that is, the homomorphisms of
$\mathbf{Z}_n^{\times}$ into the multiplicative group $\mathbf{C}^{\times}$ of non-zero complex numbers. If $\chi$ is such a
character, then we may associate with $\chi$ a function (also denoted $\chi$) from the semigroup
$\mathbf{Z}^*$ of non-zero integers as follows: Set

\[ \chi(a) = \begin{cases} \chi(\bar{a}) & \text{if } (a, n) = 1, \\ 0 & \text{otherwise.} \end{cases} \]

Then it is clear that $\chi$ maps $\mathbf{Z}^*$ into $\mathbf{C}$ and has the following properties:

(i) $\chi(a + n) = \chi(a)$,

(ii) $\chi(aa') = \chi(a)\chi(a')$,

(iii) $\chi(a) = 0$ if $(a, n) \ne 1$,

(iv) $\chi(1) = 1$.

A function $\chi$ from $\mathbf{Z}^*$ to $\mathbf{C}$ satisfying (i)-(iv) is called a numerical character modulo n.
Dirichlet’s main result about such numerical characters was the so-called orthogonality
relations, which assert the following:


\[ \text{(A)} \qquad \sum_{a} \chi(a) = \begin{cases} \varphi(n) & \text{if } \chi \text{ is identically 1,} \\ 0 & \text{otherwise,} \end{cases} \]

where a runs over a complete system of residues modulo n;

\[ \text{(B)} \qquad \sum_{\chi} \chi(a) = \begin{cases} \varphi(n) & \text{if } a \equiv 1 \pmod{n}, \\ 0 & \text{otherwise,} \end{cases} \]

where $\chi$ runs over all numerical characters modulo n. Dirichlet’s ideas gave birth to
the modern theory of duality on locally compact abelian groups.
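
The orthogonality relations (A) and (B) can be verified by direct computation in a small case. The sketch below assumes a prime modulus, for which the unit group is cyclic and every character is determined by its value on a primitive root. The modulus 7, the primitive root 3, and the function name `characters_mod_prime` are illustrative assumptions; a general modulus would need a different construction.

```python
import cmath

def characters_mod_prime(p, g):
    """All numerical characters modulo a prime p, given a primitive root g.

    The k-th character sends g**j to exp(2*pi*i*j*k/(p-1)) and multiples of p to 0.
    """
    index = {}           # discrete logarithm table: index[g**j % p] = j
    value = 1
    for j in range(p - 1):
        index[value] = j
        value = value * g % p
    if len(index) != p - 1:
        raise ValueError("g is not a primitive root modulo p")

    def make(k):
        def chi(a):
            a %= p
            if a == 0:
                return 0j
            return cmath.exp(2j * cmath.pi * k * index[a] / (p - 1))
        return chi

    return [make(k) for k in range(p - 1)]

if __name__ == "__main__":
    p = 7
    chars = characters_mod_prime(p, 3)
    # (A): sum over a complete residue system, one line per character.
    for k, chi in enumerate(chars):
        print("A", k, sum(chi(a) for a in range(p)))
    # (B): sum over all characters, one line per residue a.
    for a in range(p):
        print("B", a, sum(chi(a) for chi in chars))
```

Up to rounding, the sums in (A) equal 6 only for the character with k = 0, and the sums in (B) equal 6 only for a = 1, matching the value of Euler's function at 7.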

Dirichlet’s second great idea was to associate to each numerical character $\chi$ modulo n
and each real number s > 1, the following infinite series

\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \tag{9} \]

It is clear that the series converges absolutely and represents a continuous function
for s > 1. However, a more delicate analysis shows that the series (9) converges
(although not absolutely) for s > 0 and represents a continuous function of s in this
semi-infinite interval provided that $\chi$ is not identically 1. The function $L(s, \chi)$ has
come to be called a Dirichlet L-function.

Note the following facts about $L(s, \chi)$: First, $L(s, \chi)$ has a product formula of the
form

\[ L(s, \chi) = \prod_{p} \Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1} \quad (s > 1), \tag{10} \]

where the product is taken over all primes p. The proof of (10) is very similar to the
argument given above in Euler’s proof of the infinity of prime numbers. Therefore,
by (10),

\[ \log L(s, \chi) = -\sum_{p} \log\Bigl(1 - \frac{\chi(p)}{p^s}\Bigr) = \sum_{p} \sum_{m=1}^{\infty} \frac{\chi(p^m)}{m p^{ms}}. \tag{11} \]


Dirichlet’s idea in proving the infinitude of primes in the arithmetic progression
$a, a + n, a + 2n, \dots$, $(a, n) = 1$, was to imitate, somehow, Euler’s proof of the infinitude
of primes, by studying the function $L(s, \chi)$ for s near 1. The basic quantity to
consider is

\[ \sum_{\chi} \chi(a)^{-1} \log L(s, \chi) = \sum_{\chi} \sum_{p} \sum_{m=1}^{\infty} \frac{\chi(a)^{-1}\chi(p^m)}{m p^{ms}} = \sum_{p} \sum_{m=1}^{\infty} \frac{1}{m p^{ms}} \sum_{\chi} \chi(a)^{-1}\chi(p^m), \tag{12} \]

where we have used (11). Let $a^*$ be an integer such that $aa^* \equiv 1 \pmod{n}$. Then
$\chi(a^*) = \chi(a)^{-1}$ by (i)-(iv). Moreover,

\[ \sum_{\chi} \chi(a)^{-1}\chi(p^m) = \sum_{\chi} \chi(a^* p^m) = \begin{cases} \varphi(n) & \text{if } a^* p^m \equiv 1 \pmod{n}, \\ 0 & \text{otherwise.} \end{cases} \tag{13} \]

However, $a^* p^m \equiv 1 \pmod{n}$ is equivalent to $p^m \equiv a \pmod{n}$. Therefore, by (12) and
(13), we have

\[ \sum_{\chi} \chi(a)^{-1} \log L(s, \chi) = \varphi(n) \sum_{p} \sum_{\substack{m \ge 1 \\ p^m \equiv a \ (\mathrm{mod}\ n)}} \frac{1}{m p^{ms}}. \tag{14} \]

Thus, finally, we have

\[ \frac{1}{\varphi(n)} \sum_{\chi} \chi(a)^{-1} \log L(s, \chi) = \sum_{p} \sum_{\substack{m \ge 1 \\ p^m \equiv a \ (\mathrm{mod}\ n)}} \frac{1}{m p^{ms}} = \sum_{p \equiv a \ (\mathrm{mod}\ n)} \frac{1}{p^s} + \sum_{p} \sum_{\substack{m \ge 2 \\ p^m \equiv a \ (\mathrm{mod}\ n)}} \frac{1}{m p^{ms}} \quad (s > 1). \tag{15} \]

From (15), we immediately see that in order to prove that there are infinitely many
primes $p \equiv a \pmod{n}$, it is enough to show that the function

\[ \sum_{p \equiv a \ (\mathrm{mod}\ n)} \frac{1}{p^s} \]

tends to $+\infty$ as s approaches 1 from the right. But it is fairly easy to see that as
$s \to 1+$, the sum

\[ \sum_{p} \sum_{\substack{m \ge 2 \\ p^m \equiv a \ (\mathrm{mod}\ n)}} \frac{1}{m p^{ms}} \]

remains bounded. Thus, it suffices to show that

\[ \frac{1}{\varphi(n)} \sum_{\chi} \chi(a)^{-1} \log L(s, \chi) \to +\infty \quad (s \to 1+). \]

However, if $\chi_0$ denotes the character which is identically 1, then it is easy to see that

\[ \frac{1}{\varphi(n)} \chi_0(a)^{-1} \log L(s, \chi_0) \to +\infty \quad \text{as } s \to 1+. \]

Therefore, it is enough to show that if $\chi \ne \chi_0$, then $\log L(s, \chi)$ remains bounded as
$s \to 1+$. We have already mentioned that $L(s, \chi)$ is continuous for s > 0 if $\chi \ne \chi_0$.
Therefore, it suffices to show that $L(1, \chi) \ne 0$. And this is precisely what Dirichlet
showed.
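
A concrete instance may help make the non-vanishing statement tangible. For the non-principal character modulo 4 (equal to 1 on integers congruent to 1, to -1 on integers congruent to 3, and to 0 on even integers), the series at s = 1 is the Leibniz series 1 - 1/3 + 1/5 - ..., whose sum is pi/4. The sketch below evaluates partial sums; the names `chi4` and `l_partial` and the number of terms are assumptions made for the example, and the code says nothing about how Dirichlet proved the general case.

```python
import math

def chi4(n):
    """The non-principal numerical character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def l_partial(s, terms):
    """Partial sum of the Dirichlet series: sum of chi4(n) / n**s for 1 <= n <= terms."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))

if __name__ == "__main__":
    for terms in (11, 1001, 200001):
        print(terms, l_partial(1.0, terms))
    print("pi/4 =", math.pi / 4)
```

Because the series alternates with decreasing terms, the error of a partial sum is smaller than the first omitted term, so convergence at s = 1 is slow but easy to bound.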




Dirichlet’s theorem on primes in arithmetic progressions was one of the major 
achievements of 19th century mathematics, because it introduced a fertile new idea 
into number theory —that analytic methods (in this case the study of the Dirichlet 
L-series) could be fruitfully applied to arithmetic problems (in this case the problem 
of primes in arithmetic progressions). To the novice, such an application of analysis 
to number theory would seem to be a waste of time. After all, number theory is the 
study of the discrete, whereas analysis is the study of the continuous; and what 
should one have to do with the other! However, Dirichlet’s 1837 paper was the 
beginning of a revolution in number-theoretic thought, the substance of which was to 
apply analysis to number theory. At first, undoubtedly, mathematicians were very 
uncomfortable with Dirichlet’s ideas. They regarded them as very clever devices, 
which would eventually be supplanted by completely arithmetic ideas. For although 
analysis might be useful in proving results about the integers, surely the analytic 
tools were not intrinsic. Rather, they entered the theory of the integers in an inessential 
way and could be eliminated by the use of suitably sophisticated arithmetic. However, 
the history of number theory in the 19th century shows that this idea was eventually 
repudiated and the rightful connection between analysis and number theory came to 
be recognized. 

The first major progress toward a proof of the prime number theorem after
Dirichlet was due to the Russian mathematician Tchebycheff in two memoirs [12, 13]
written in 1851 and 1852. Tchebycheff introduced the following two functions of a
real variable x:

\[ \theta(x) = \sum_{p \le x} \log p, \]

\[ \psi(x) = \sum_{p^m \le x} \log p, \]

where p runs over primes and m over positive integers. Tchebycheff proved that the
prime number theorem (1) is equivalent to either of the two statements

\[ \lim_{x \to \infty} \frac{\theta(x)}{x} = 1, \tag{16} \]

\[ \lim_{x \to \infty} \frac{\psi(x)}{x} = 1. \tag{17} \]

Moreover, Tchebycheff proved that if $\lim_{x \to \infty} (\theta(x)/x)$ exists, then its value must be 1.
Furthermore, Tchebycheff proved that

\[ .92129 \le \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1 \le \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1.10555. \tag{18} \]


Tchebycheff’s methods were of an elementary, combinatorial nature, and as such 
were not powerful enough to prove the prime number theorem. 
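
The two functions are simple to tabulate. The sketch below, with names and the sample value x = 1,000,000 chosen here for illustration, sums log p over primes and over prime powers and prints the ratios appearing in (16), (17) and (18). It gives numerical evidence only; the bounds in (18) are statements about limits, not about any single x.

```python
import math

def prime_flags(n):
    """Sieve of Eratosthenes: flags[k] == 1 exactly when k is prime, for 0 <= k <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return flags

def chebyshev(x):
    """Return (pi(x), theta(x), psi(x)) for a positive integer x."""
    flags = prime_flags(x)
    count, theta, psi = 0, 0.0, 0.0
    for p in range(2, x + 1):
        if flags[p]:
            log_p = math.log(p)
            count += 1
            theta += log_p
            q = p
            while q <= x:        # one contribution of log p for each power p**m <= x
                psi += log_p
                q *= p
    return count, theta, psi

if __name__ == "__main__":
    x = 10 ** 6
    count, theta, psi = chebyshev(x)
    print("theta(x)/x          =", theta / x)
    print("psi(x)/x            =", psi / x)
    print("pi(x) / (x/log x)   =", count / (x / math.log(x)))
```

The difference between psi and theta comes only from squares and higher powers of primes, which is why the two ratios are so close.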




The first giant strides toward a proof of the prime number theorem were taken by
B. Riemann in a memoir [10] written in 1860. Riemann followed Dirichlet in connecting
problems of an arithmetic nature with the properties of a function of a
continuous variable. However, where Dirichlet considered the functions $L(s, \chi)$ as
functions of a real variable s, Riemann took the decisive step in connecting arithmetic
with the theory of functions of a complex variable. Riemann introduced the following
function:

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \tag{19} \]

which has come to be known as the Riemann zeta function. It is reasonably easy to
see that the series (19) converges absolutely and uniformly for s in a compact subset
of the half-plane Re(s) > 1. Thus, $\zeta(s)$ is analytic for Re(s) > 1. Moreover, by using
the same sort of argument used in Euler’s proof of the infinitude of primes, it is easy
to prove that

\[ \zeta(s) = \prod_{p} \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} \quad (\mathrm{Re}(s) > 1), \tag{20} \]

where the product is extended over all primes p. Euler’s proof of the infinitude of
primes suggests that the behavior of $\zeta(s)$ for s = 1 is somehow connected with the
distribution of primes. And, indeed, this is the case.

Riemann proved that $\zeta(s)$ can be analytically continued to a function which is
meromorphic in the whole s-plane. The only singularity of $\zeta(s)$ occurs at s = 1 and
the Laurent series about s = 1 looks like

\[ \zeta(s) = \frac{1}{s - 1} + a_0 + a_1(s - 1) + \cdots. \tag{21} \]

Moreover, if we set

\[ R(s) = s(s - 1)\pi^{-s/2}\Gamma(s/2)\zeta(s), \tag{22} \]

then R(s) is an entire function of s and satisfies the functional equation

\[ R(s) = R(1 - s). \tag{23} \]


To see the immediate connection between the Riemann zeta function and the
distribution of primes, let us return to Euler’s proof of the infinitude of primes.
A variation on the idea of Euler’s proof is as follows: Suppose that there were only
finitely many primes $p_1, \dots, p_N$. Then by (20), $\zeta(s)$ would be bounded as s tends to 1,
which contradicts equation (21). Thus, the presence of a pole of $\zeta(s)$ at s = 1 immediately
implies that there are infinitely many primes. But the connection between
the zeta function and the distribution of primes runs even deeper.




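Let us consider the following heuristic argument: From equation (20), it is easy
to deduce that

\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{p} \sum_{m=1}^{\infty} (\log p)\, p^{-ms} \quad (\mathrm{Re}(s) > 1). \tag{24} \]

Moreover, by residue calculus, it is easy to verify that

\[ \lim_{T \to \infty} \frac{1}{2\pi i} \int_{2 - iT}^{2 + iT} \frac{x^s}{s}\, ds = \begin{cases} 1, & x > 1, \\ 0, & 0 < x < 1. \end{cases} \tag{25} \]

Therefore, assuming that interchange of limit and summation is justified, we see that
for x not equal to an integer, we have

\[ \lim_{T \to \infty} \frac{1}{2\pi i} \int_{2 - iT}^{2 + iT} \Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr) \frac{x^s}{s}\, ds = \sum_{p} \sum_{m=1}^{\infty} (\log p) \lim_{T \to \infty} \frac{1}{2\pi i} \int_{2 - iT}^{2 + iT} \Bigl(\frac{x}{p^m}\Bigr)^{s} \frac{ds}{s} = \sum_{p^m \le x} \log p = \psi(x), \tag{26} \]

where the second equality uses equation (25) with x replaced by $x/p^m$.

Thus, we see that there is an intimate connection between the function $\psi(x)$ and $\zeta(s)$.
This connection was first exploited by Riemann, in his 1860 paper.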

Note that the function

\[ -\frac{x^s}{s} \frac{\zeta'(s)}{\zeta(s)} \tag{27} \]

has poles at s = 0, at s = 1 (the pole of $\zeta(s)$), and at all zeroes $\rho$ of $\zeta(s)$. Moreover, note that by equation (20),
we see that $\zeta(s) \ne 0$ for Re(s) > 1. Therefore, all zeroes of $\zeta(s)$ lie in the half-plane
Re(s) $\le$ 1. Further, since R(s) is entire and $\zeta(s) \ne 0$ for Re(s) > 1, the functional
equation (23) implies that the only zeroes of $\zeta(s)$ for which Re(s) < 0 are at
s = −2, −4, −6, −8, ..., and these are all simple zeroes and are called the trivial
zeroes of $\zeta(s)$. Thus, we have shown that all non-trivial zeroes of $\zeta(s)$ lie in the strip
0 $\le$ Re(s) $\le$ 1. This strip is called the critical strip. The residue of (27) at a simple
non-trivial zero $\rho$ is

\[ -\frac{x^{\rho}}{\rho}. \]

Thus, if $\sigma$ is a large negative number, and if $C_{\sigma,T}$ denotes the rectangle with vertices
$\sigma \pm iT$, $2 \pm iT$, then Cauchy’s theorem implies that

\[ \frac{1}{2\pi i} \int_{2 - iT}^{2 + iT} \Bigl(-\frac{x^s}{s} \frac{\zeta'(s)}{\zeta(s)}\Bigr) ds = \frac{1}{2\pi i} \Bigl\{\int_{2 - iT}^{\sigma - iT} + \int_{\sigma - iT}^{\sigma + iT} + \int_{\sigma + iT}^{2 + iT}\Bigr\} \Bigl(-\frac{x^s}{s} \frac{\zeta'(s)}{\zeta(s)}\Bigr) ds + R(\sigma, T), \tag{28} \]

where $R(\sigma, T)$ denotes the sum of the residues of the function (27) at the poles inside
$C_{\sigma,T}$. By letting $\sigma \to -\infty$ and $T \to \infty$ and by applying equations (26) and (28),
Riemann arrived at the following remarkable formula, known as Riemann’s explicit
formula

\[ \psi(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac{1}{2} \log(1 - x^{-2}), \tag{29} \]

where $\rho$ runs over all non-trivial zeroes of the Riemann zeta function. Riemann's
formula is surprising for at least two reasons. First, it connects the function $\psi(x)$,
which is connected with the distribution of primes, with the distribution of the
zeroes of the Riemann zeta function. That there should be any connection at all is
amazing. But, secondly, the formula (29) explicitly puts in evidence a form of the
prime number theorem by equating $\psi(x)$ with $x$ plus an error term which depends
on the zeroes of the zeta function. If we denote this error term by $E(x)$, then we see
that the prime number theorem is equivalent to the assertion


\[ \lim_{x\to\infty}\frac{E(x)}{x} = 0, \tag{30} \]

which, in turn, is equivalent to the assertion

\[ \lim_{x\to\infty}\frac{1}{x}\sum_{\rho}\frac{x^{\rho}}{\rho} = 0. \tag{31} \]

Riemann was unable to prove (31), but he made a number of conjectures concerning
the distributions of the zeroes $\rho$ from which the statement (31) follows immediately.
The most famous of Riemann's conjectures is the so-called Riemann hypothesis,
which asserts that all non-trivial zeroes of $\zeta(s)$ lie on the line $\mathrm{Re}(s) = 1/2$, which is the
line of symmetry of the functional equation (23). This conjecture has resisted all
attempts to prove it for more than a century and is one of the most celebrated open
problems in all of mathematics. However, if the Riemann hypothesis is true, then

\[ \left|\frac{x^{\rho}}{\rho}\right| = \frac{x^{1/2}}{|\rho|}, \]


and from this fact and equation (29), it is possible to prove that

\[ \psi(x) = x + O\left(x^{1/2+\varepsilon}\right) \tag{32} \]

for every $\varepsilon > 0$, where $O(x^{1/2+\varepsilon})$ denotes a function $f(x)$ such that $f(x)/x^{1/2+\varepsilon}$ is bounded
for all large x. Thus, the Riemann hypothesis implies (31) in a trivial way, and hence
the prime number theorem follows from the Riemann hypothesis. What is perhaps
more striking is the fact that if (32) holds then the Riemann hypothesis is true. Thus,
the prime number theorem in the sharp form (32) is equivalent to the Riemann
hypothesis. We see, therefore, that the connection between the zeta function and the
distribution of primes is no accidental affair, but somehow is woven into the fabric of
nature.

In his memoir, Riemann made many other conjectures. For example, if $N(T)$
denotes the number of non-trivial zeroes $\rho$ of $\zeta(s)$ such that $-T < \mathrm{Im}(\rho) \le T$, then
Riemann conjectured that

\[ N(T) = \frac{1}{\pi}\,T\log T - \frac{1+\log(2\pi)}{\pi}\,T + O(\log T). \tag{33} \]

The formula (33) was first proven by von Mangoldt in 1895 [14]. An interesting line
of research has been involved in obtaining estimates for the number of non-trivial
zeroes $\rho$ on the line $\mathrm{Re}(s) = 1/2$. Let $M(T)$ denote the number of $\rho$ such that
$\mathrm{Re}(\rho) = 1/2$, $-T < \mathrm{Im}(\rho) \le T$. Then Hardy [6], in 1914, proved that $M(T)$ tends to infinity as
T tends to infinity. Later, Hardy and Littlewood [7] improved the argument to prove that $M(T) > AT$,
where A is a positive constant, not depending on T. The ultimate result of this sort
was obtained by Atle Selberg in 1943 [11]. He proved that $M(T) > AT\log T$ for some
positive constant A. In view of equation (33), Selberg's result shows that a positive
percentage of the zeroes of $\zeta(s)$ actually lie on the line $\mathrm{Re}(s) = 1/2$. This result represents
the best progress made to date in attempting to prove the Riemann hypothesis.

Fortunately, it is not necessary to prove the Riemann hypothesis in order to
prove the prime number theorem in the form (17). However, it is necessary to obtain
some information about the distribution of the zeroes of $\zeta(s)$. Such information was
obtained independently by Hadamard [5] and de la Vallée Poussin [1] in 1896,
thereby providing the first complete proofs of the prime number theorem. Although
their proofs differ in detail, they both establish the existence of a zero-free region for
$\zeta(s)$, the existence of which serves as a substitute for the Riemann hypothesis in the
reasoning presented above. More specifically, they proved that there exist constants
$a$, $t_0$ such that $\zeta(\sigma + it) \neq 0$ if $\sigma \ge 1 - 1/(a\log|t|)$, $|t| \ge t_0$. This zero-free region
allows one to prove the prime number theorem in the form


(34) W(x) = x + O(xe (8), 


Please note, however, that the error term in (34) is much larger than the error term 
predicted by the Riemann hypothesis. 

Thus, the prime number theorem was finally proved after a century of hard work 
by many of the world’s best mathematicians. It is grossly unfair to attribute proof of 
such a theorem to the genius of a single individual. For, as we have seen, each step in 
the direction of a proof was conditioned historically by the work of preceding 
generations. On the other hand, to deny that there is genius in the work which led 
up to the ultimate proof would be equally unfair. For at each step in the chain of 
discovery, brilliant and fertile ideas were discovered, and provided the material out 
of which to fashion the next link. 




APPENDIX A: Samples from Gauss' Tables.

TABLE 1 (Werke, II, p. 436)

THOUSAND   PRIMES        THOUSAND   PRIMES
    1        168             26        98
    2        135             27       101
    3        127             28        94
    4        120             29        98
    5        119             30        92
    6        114             31        95
    7        117             32        92
    8        107             33       106
    9        110             34       100
   10        112             35        94
   11        106             36        92
   12        103             37        99
   13        109             38        94
   14        105             39        90
   15        102             40        96
   16        108             41        88
   17         98             42       101
   18        104             43       102
   19         94             44        85
   20        102             45        96
   21         98             46        86
   22        104             47        90
   23        100             48        95
   24        104             49        89
   25         94             50        98

The frequency of primes.

TABLE 2 (Werke, II, p. 443)   2000000 ... 3000000

         210    220    230    240    250    260    270    280    290    300    Total
   0                                        ...    ...                             1
   1       3      2      2      4      1    ...    ...      2      2      2       25
   2      10      9      9     11      9    ...    ...      7     15     13       98
   3      32     27     29     32     37    ...    ...     43     30     44      337
   4      69     69     73     86     78    ...    ...     95     85     64      778
   5     119    146    138    136    147    ...    ...    135    140    153     1408
   6     197    183    179    176    193    ...    ...    195    179    187     1878
   7     204    201    205    194    189    ...    ...    188    222    214     1998
   8     157    168    168    158    151    ...    ...    145    132    134     1525
   9     115    109    113    112    102    ...    ...     87    109    103     1034
  10      63     52     44     55     58    ...    ...     67     53     58      561
  11      21     18     30     28     23    ...    ...     24     18     15      223
  12       8      9     10      7      7    ...    ...      9      8     11       99
  13       2      4             1      5    ...    ...      2      5      1       27
  14              3                         ...    ...             2               6
  15                                                                      1        1
  16
  17                                                        1                      1
        6874   6857   6849   6787   6766   6804   6762   6714   6744   6705    67862




APPENDIX B: Gauss’ Letter to Enke. 


My distinguished friend: 

Your remarks concerning the frequency of primes were of interest to me in more ways than one. 
You have reminded me of my own endeavors in this field which began in the very distant past, in 
1792 or 1793, after I had acquired the Lambert supplements to the logarithmic tables. Even before 
I had begun my more detailed investigations into higher arithmetic, one of my first projects was 
to turn my attention to the decreasing frequency of primes, to which end I counted the primes in 
several chiliads (intervals of length 1000; Trans.) and recorded the results on the attached white 
pages. I soon recognized that behind all of its fluctuations, this frequency is on the average inversely
proportional to the logarithm, so that the number of primes below a given bound x is approximately
equal to

\[ \int \frac{dn}{\log n}, \]

where the logarithm is understood to be hyperbolic. Later on, when I became acquainted with the 
list in Vega’s tables (1796) going up to 400031, I extended my computation further, confirming that 
estimate. In 1811, the appearance of Chernac's cribrum gave me much pleasure and I have frequently
(since I lack the patience for a continuous count) spent an idle quarter of an hour to count another
chiliad here and there; although I eventually gave it up without quite getting through a million. 
Only some time later did I make use of the diligence of Goldschmidt to fill some of the remaining 
gaps in the first million and to continue the computation according to Burkhardt's tables. Thus (for
many years now) the first three million have been counted and checked against the integral. A
small excerpt follows:


TABLE A

  Below     Here are      Integral       Error       Your        Error
             Prime      ∫ dn/log n                  Formula

  500000      41556       41606.4      +  50.4     41596.9     +  40.9
 1000000      78501       78627.5      + 126.5     78672.7     + 171.7
 1500000     114112      114263.1      + 151.1    114374.0     + 264.0
 2000000     148883      149054.8      + 171.8    149233.0     + 350.0
 2500000     183016      183245.0      + 229.0    183495.1     + 479.1
 3000000     216745      216970.6      + 225.6    217308.5     + 563.5
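
The "Integral" column can be recomputed. The sketch below is an editorial illustration, not part of the letter. It assumes the integral is the logarithmic integral li(x) taken in the principal-value sense (the letter does not state a lower limit, so this convention is an assumption) and evaluates it with the standard series li(x) = γ + log log x + Σ (log x)^k / (k · k!). The name `li` and the constant `EULER_GAMMA` are choices made here.

```python
import math

EULER_GAMMA = 0.5772156649015329

def li(x, terms=200):
    """Logarithmic integral li(x) for x > 1 via
    gamma + log(log x) + sum_{k>=1} (log x)**k / (k * k!)."""
    L = math.log(x)
    total = EULER_GAMMA + math.log(L)
    term = 1.0                      # holds L**k / k! as k advances
    for k in range(1, terms + 1):
        term *= L / k
        total += term / k
    return total

# Gauss' prime counts as printed in Table A.
gauss_counts = {500000: 41556, 1000000: 78501, 1500000: 114112,
                2000000: 148883, 2500000: 183016, 3000000: 216745}

for x, count in gauss_counts.items():
    value = li(x)
    print(f"{x:>8} {count:>7} {value:10.1f} {value - count:+8.1f}")
```

Under this convention the value at 1000000 agrees with the entry 78627.5 to the printed accuracy; the other rows can be compared in the same way.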


I was not aware that Legendre had also worked on this subject; your letter caused me to look
in his Théorie des Nombres, and in the second edition I found a few pages on the subject which I
must have previously overlooked (or, by now, forgotten). Legendre used the formula

\[ \frac{n}{\log n - A}, \]

where A is a constant which he sets equal to 1.08366. After a hasty computation, I find in the above
cases the deviations


TABLE B 


— 23,3 
+ 42,2 
+ 68,1 
+ 92,8 
+159, 
+167,6 


These differences are even smaller than those from the integral, but they seem to grow faster with n 
so that it is quite possible they may surpass them. To make the count and the formula agree, one 
would have to use, respectively, instead of A = 1.08366, the following numbers: 


TABLE C 


1,09040 
1,07682 
1,07582 
1,07529 
1,07179 
1,07297 
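
The deviations of Table B and the adjusted constants of Table C can be recomputed from the counts in Table A. This is a minimal sketch for illustration only: the helper names `legendre_estimate` and `fitted_A` are not from the letter. The fitted constant is obtained by solving n/(log n − A) = count for A.

```python
import math

A_LEGENDRE = 1.08366

def legendre_estimate(n, A=A_LEGENDRE):
    """Legendre's approximation n / (log n - A) to the number of primes below n."""
    return n / (math.log(n) - A)

def fitted_A(n, count):
    """The value of A for which n / (log n - A) equals the given count."""
    return math.log(n) - n / count

# Gauss' prime counts as printed in Table A.
gauss_counts = {500000: 41556, 1000000: 78501, 1500000: 114112,
                2000000: 148883, 2500000: 183016, 3000000: 216745}

for n, count in gauss_counts.items():
    deviation = legendre_estimate(n) - count
    print(f"{n:>8}  {deviation:+7.1f}  {fitted_A(n, count):.5f}")
```

For n = 1000000 this gives a deviation of about +42.2 and a fitted constant of about 1.07682, in agreement with the second rows of Tables B and C.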


It appears that, with increasing n, the (average) value of A decreases; however, I dare not
conjecture whether the limit as n approaches infinity is 1 or a number different from 1. I cannot say that
there is any justification for expecting a very simple limiting value; on the other hand, the excess of
A over 1 might well be a quantity of the order of 1/log n. I would be inclined to believe that the
differential of the function must be simpler than the function itself.

If dn/log n is postulated for the function, Legendre's formula would suggest that the differential
function might be something of the form dn/(log n − (A − 1)). By the way, for large n, your
formula could be considered to coincide with

\[ \frac{n}{\log n - (1/2k)}, \]

where k is the modulus of Briggs' logarithms; that is, with Legendre's formula, if we put
A = 1/2k = 1.1513.
Finally, I want to remark that I noticed a couple of disagreements between your counts and mine. 


Between  59000 and  60000, you have 95, while I have 94
        101000     102000           94               93.


The first difference possibly results from the fact that, in Lambert's Supplement, the prime 59023
occurs twice. The chiliad from 101000 to 102000 in Lambert's Supplement is virtually crawling with
errors; in my copy, I have indicated seven numbers which are not primes at all, and supplied two
missing ones. Would it not be possible to induce young Mr. Dase to count the primes in the following
(few) millions, using the tables at the Academy which, I am afraid, are not intended for public
distribution? In this case, let me remark that in the 2nd and 3rd million, the count is, according to my
instructions, based on a special scheme which I myself have employed in counting the first million.
The counts for each 100000 are indicated on a single page in 10 columns, each column belonging
to one myriad (an interval of length 10000; Trans.); an additional column in front (left) and another
column following it on the right; for example here is a vertical column and the two additional columns
for the interval 1000000 to 1100000 - - -

As an illustration, take the first vertical column. In the myriad 1000000 to 1010000 there are 100
hecatontades (intervals of length 100; Trans.); among them one containing a single prime, none
containing two or three primes; two containing four each; eleven containing 5 each, etc., yielding
altogether 752 = 1·1 + 4·2 + 5·11 + 6·14 + ... primes. The last column contains the totals from the
other ten. The numbers 14, 15, 16 in the first vertical column are superfluous since no hecatontades
occur containing that many primes; but on the following pages they are needed. Finally the 10
pages are again combined into one and thus comprise the entire second million.
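
The counting scheme described in this paragraph can be imitated mechanically. The following sketch is an editorial illustration (the function names are hypothetical, and a sieve replaces the printed tables Gauss worked from). It tallies, for one myriad, how many hecatontades contain exactly k primes, and recovers the number of primes in the myriad as the sum of k times the tally.

```python
import math
from collections import Counter

def prime_sieve(limit):
    """bytearray with sieve[k] == 1 exactly when k is prime, for 0 <= k <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return sieve

def hecatontade_tally(sieve, start, length=10000, block=100):
    """Map k -> number of blocks of `block` integers in (start, start + length]
    that contain exactly k primes. Multiples of 100 are never prime, so it does
    not matter to which neighbouring block an endpoint is assigned."""
    tally = Counter()
    for lo in range(start, start + length, block):
        tally[sum(sieve[lo + 1: lo + block + 1])] += 1
    return tally

sieve = prime_sieve(1_010_000)
tally = hecatontade_tally(sieve, 1_000_000)
total_primes = sum(k * c for k, c in tally.items())
for k in sorted(tally):
    print(k, tally[k])
print("primes in the myriad:", total_primes)
```

The printed total can be set beside the figure 752 quoted in the letter; the sketch makes no claim about whether the two agree.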


It is high time to quit — ——. With most cordial wishes for your good health 
Yours, as ever, 
C. F. Gauss 


Göttingen, 24 December 1849.


APPENDIX C: Corrections to Gauss’ Tables 


THOUSANDS     GAUSS     ACTUAL       Δ
     20         102        104       -2
    159          87         77      +10
    199          96         86      +10
    206          85         83       +2
    245          78         88      -10
    289          85         77       +8
    290          84         85       -1
    334          80         81       -1
    352          80         81       -1
    354          79         76       +3
    500       UP TO HERE            +18

TOTALS                               Δ

  500,000     41,556     41,538     +18
1,000,000     78,501     78,498*     +3
1,500,000    114,112    114,156*    -44
2,000,000    148,883    148,934*    -51
2,500,000    183,016    183,073*    -57
3,000,000    216,745    216,817*    -72


* from List of Prime Numbers from 1 to 10,006,771, by D. N. Lehmer, (adjusted: He counts 1 as 
a prime). 
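
The "ACTUAL" column is straightforward to reproduce today. The sketch below is an illustration only and is not the calculation used for this appendix. It sieves the first million and counts the primes in each thousand; the function names are hypothetical.

```python
import math

def prime_sieve(limit):
    """bytearray with sieve[k] == 1 exactly when k is prime, for 0 <= k <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return sieve

def chiliad_counts(sieve, how_many):
    """Number of primes in each interval (1000*(k-1), 1000*k], k = 1..how_many."""
    return [sum(sieve[1000 * (k - 1) + 1: 1000 * k + 1])
            for k in range(1, how_many + 1)]

sieve = prime_sieve(1_000_000)
counts = chiliad_counts(sieve, 1000)

# Thousands singled out in the table above, for comparison with the ACTUAL column.
for k in (20, 159, 199, 206, 245, 289, 290, 334, 352, 354):
    print(k, counts[k - 1])
print("below   500,000:", sum(counts[:500]))
print("below 1,000,000:", sum(counts))
```

The two cumulative totals printed at the end correspond to the first two rows of the TOTALS table.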


Research supported by NSF Grant GP 31280X. This article was presented to the History of Mathe- 
matics Seminar at the University of Maryland on March 20, 1972. The author wishes to thank Pro- 
fessor Gertrude Ehrlich for preparing the translation of Gauss’ letter which appears in Appendix B. 
He also wishes to thank Mr. Edward Korn for providing the calculations of Appendix C. 


References 


1. Ch. de la Vallée Poussin, Recherches analytiques sur la théorie des nombres premiers. Première
partie: La fonction ζ(s) de Riemann et les nombres premiers en général. Deuxième partie: Les
fonctions de Dirichlet et les nombres premiers de la forme linéaire Mx + N. Troisième partie: Les formes
quadratiques de déterminant négatif, Ann. Soc. Sci. Bruxelles, 20 (1896) 183-256, 281-397.

2. L. Dirichlet, Über den Satz: das jede unbegrenzte arithmetische Progression, deren erstes
Glied und Differenz keinen gemeinschaftlichen Factor sind, unendlichen viele Primzahlen enthält,
1837; Mathematische Abhandlungen, Bd. 1, (1889) 313-342.

3. C. F. Gauss, Tafel der Frequenz der Primzahlen, Werke, II (1872) 436-442.

4. -----, Gauss an Enke, Werke, II (1872) 444-447.

5. J. Hadamard, Sur la distribution des zéros de la fonction ζ(s) et ses conséquences
arithmétiques, Bull. Soc. Math. de France, 24 (1896) 199-220.

6. G. H. Hardy, Sur les zéros de la fonction ζ(s) de Riemann, Comptes Rendus, 158 (1914)
1012-14.

7. -----, and J. E. Littlewood, The zeros of Riemann's zeta function on the critical line, Math.
Zeit., 10 (1921) 283-317.

8. A. M. Legendre, Essai sur la théorie de Nombres, 1st ed., 1798, Paris, p. 19.

9. -----, Essai sur la Théorie de Nombres, 2nd ed., 1808, Paris, p. 394.

10. B. Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Grösse, Gesammelte
Mathematische Werke, 2nd Aufl., (1892) 145-155.

11. A. Selberg, On the zeros of Riemann's zeta function, Skr. Norske Vid. Akad., Oslo (1942)
no. 10.

12. P. Tchebycheff, Sur la fonction qui détermine la totalité de nombres premiers inférieurs
à une limite donnée, Oeuvres, 1 (1899) 27-48.

13. -----, Mémoire sur les nombres premiers, Oeuvres, 1 (1899) 49-70.

14. H. von Mangoldt, Auszug aus einer Arbeit unter dem Titel: Zu Riemann's Abhandlung
über die Anzahl der Primzahlen unter einer gegebenen Grösse, Sitz. König. Preus. Akad. Wiss.
zu Berlin, (1894) 337-350, 883-895.


DIFFERENTIATION UNDER THE INTEGRAL SIGN* 
HARLEY FLANDERS, Tel-Aviv University 


1. Introduction. Everyone knows the Leibniz rule for differentiating an integral:

\[
\frac{d}{dt}\int_{g(t)}^{h(t)} F(x,t)\,dx
 = F[h(t),t]\,\dot h(t) - F[g(t),t]\,\dot g(t) + \int_{g(t)}^{h(t)}\frac{\partial F(x,t)}{\partial t}\,dx. \tag{1.1}
\]

We are all fond of this formula, although it is seldom if ever used in such generality.
Usually, either the limits are constants, or the integrand is independent of the time
t. Frequent cases are

\[
\frac{d}{dt}\int_{a}^{t} F(x)\,dx = F(t), \qquad
\frac{d}{dt}\int_{a}^{b} F(x,t)\,dx = \int_{a}^{b}\frac{\partial F(x,t)}{\partial t}\,dx.
\]
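
Formula (1.1) is easy to test numerically. The sketch below is an illustration under stated assumptions: the integrand sin(xt) and the limits g(t) = t, h(t) = t² + 1 are arbitrary choices, the integrals are approximated by Simpson's rule, and the left-hand side is approximated by a central difference.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Illustrative integrand and moving limits (assumptions, not from the text).
def F(x, t): return math.sin(x * t)
def dF_dt(x, t): return x * math.cos(x * t)
def g(t): return t
def g_dot(t): return 1.0
def h(t): return t * t + 1.0
def h_dot(t): return 2.0 * t

def integral(t):
    return simpson(lambda x: F(x, t), g(t), h(t))

def leibniz_rhs(t):
    """Right-hand side of (1.1): two boundary terms plus the integrand variation."""
    return (F(h(t), t) * h_dot(t) - F(g(t), t) * g_dot(t)
            + simpson(lambda x: dF_dt(x, t), g(t), h(t)))

t0, eps = 0.7, 1e-5
lhs = (integral(t0 + eps) - integral(t0 - eps)) / (2 * eps)
rhs = leibniz_rhs(t0)
print(lhs, rhs)
```

The two printed numbers should agree to within the accuracy of the difference quotient.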


a 


* Presented May 5, 1972 to the Rocky Mountain Section meeting, Southern Colorado State 
College, Pueblo, CO. 




One proof runs as follows, modulo precisely stated hypotheses and some analytic
details. Set

\[ \Phi(u,v,t) = \int_{u}^{v} F(x,t)\,dx, \tag{1.2} \]

$u = g(t)$, and $v = h(t)$. By the chain rule

\[
\frac{d}{dt}\,\Phi[g(t),h(t),t]
 = \left(\frac{\partial\Phi}{\partial u}\,\dot g + \frac{\partial\Phi}{\partial v}\,\dot h\right) + \frac{\partial\Phi}{\partial t}.
\]

The first two terms are bracketed because they measure all changes due to variation
of the interval of integration $[g(t),h(t)]$, and they are evaluated by applying the
Fundamental Theorem to (1.2). The third term measures change due to variation of
the integrand. If enough smoothness is assumed to justify interchange of the
integration and differentiation operators, then

\[ \frac{\partial\Phi}{\partial t} = \int_{u}^{v}\frac{\partial F(x,t)}{\partial t}\,dx. \tag{1.3} \]


We shall discuss generalizations of the Leibniz rule to more than one dimension. 
Such generalizations seem to be common knowledge among physicists, some dif- 
ferential geometers, and applied mathematicians who work in continuum mechanics, 
but are virtually unheard of among most mathematicians. I cannot find a single 
mention of such formulas in the current advanced calculus and several variable 
texts, except for Loomis and Sternberg [4]. 


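REMARK: A nice approach to (1.3) is via interchange of the order of integration
(Fubini's Theorem):

\[
\begin{aligned}
\int_{u}^{v}\frac{\partial F(x,t)}{\partial t}\,dx
 &= \frac{d}{dt}\int_{a}^{t} ds\int_{u}^{v}\frac{\partial F(x,s)}{\partial s}\,dx \\
 &= \frac{d}{dt}\int_{u}^{v} dx\int_{a}^{t}\frac{\partial F(x,s)}{\partial s}\,ds \\
 &= \frac{d}{dt}\int_{u}^{v}\bigl[F(x,t) - F(x,a)\bigr]\,dx \\
 &= \frac{d}{dt}\int_{u}^{v} F(x,t)\,dx.
\end{aligned}
\]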


See for example Fleming [3] for details. 


2. Another proof. We shall concentrate on the change due to variation of the
interval. This puts us in the proper frame of mind for generalization to more
dimensions, where the real difficulties are with the moving domain, not with the
time-varying integrand. Anyhow, we know how to separate the domain variation from the
integrand variation by the chain rule device used above.
Thus we are concentrating on

\[ \frac{d}{dt}\int_{g(t)}^{h(t)} F(x)\,dx. \]


The domain of integration, the interval $C_t = [g(t), h(t)]$, is moving with time, but
we have no idea how points interior to the domain move. Only the motion of the
boundary points has been prescribed; no one said anything about interior points!
Even though we know in advance that the answer is independent of how the
interior points may move, we shall stubbornly insist that they have a definite motion.
Imagine the interval $C_t$ is a worm crawling along the x-axis. As it stretches and
shrinks and does the things worms do, each point of its body follows some irregular
trajectory. Suppose initially the worm's points are labeled $u$, where $a \le u \le b$, and
at time $t$, the point initially at $u$ is at $x = x(u,t)$. Now, a worm can only shrink so
much, so $\partial x/\partial u > 0$. For each $t$, the map $u \to x(u,t)$ is smooth one-one with smooth
(continuously differentiable) inverse. We might write $\phi_t$ for this map at $t$:

\[ \phi_t(u) = x(u,t), \qquad \phi_t : [a,b] \to [\phi_t(a), \phi_t(b)] = [g(t), h(t)] = C_t. \]

By the formula for change of variable in a simple integral,

\[
\int_{g(t)}^{h(t)} F(x)\,dx = \int_{\phi_t(a)}^{\phi_t(b)} F(x)\,dx = \int_{a}^{b} F[x(u,t)]\,\frac{\partial x}{\partial u}\,du.
\]
g(t) $.(a) a ou 


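This transition is excellent, because it has changed the integral over a moving domain
to one over a fixed domain. We pay for this fixed domain with a time-varying
integrand. No matter, we like it; we thrive on differentiation under the integral sign:

\[
\begin{aligned}
\frac{d}{dt}\int_{g(t)}^{h(t)} F(x)\,dx
 &= \frac{d}{dt}\int_{a}^{b} F[x(u,t)]\,\frac{\partial x}{\partial u}\,du \\
 &= \int_{a}^{b}\frac{\partial}{\partial t}\Bigl\{F[x(u,t)]\,\frac{\partial x}{\partial u}\Bigr\}\,du \\
 &= \int_{a}^{b}\Bigl\{F'[x(u,t)]\,\frac{\partial x}{\partial t}\,\frac{\partial x}{\partial u}
    + F[x(u,t)]\,\frac{\partial^{2} x}{\partial u\,\partial t}\Bigr\}\,du.
\end{aligned}
\]

The fixed domain has done its job, and we return to the moving domain. The
instantaneous velocity is $v = v(u,t) = \partial x/\partial t$, which we also consider as a function of $x$
and $t$ via the transformation $(u,t) \leftrightarrow (x,t)$. When $t$ is fixed,

\[
\frac{\partial^{2} x}{\partial u\,\partial t} = \frac{\partial}{\partial u}\Bigl(\frac{\partial x}{\partial t}\Bigr)
 = \frac{\partial v}{\partial u} = \frac{\partial v}{\partial x}\,\frac{\partial x}{\partial u},
\]

hence

\[
\begin{aligned}
\frac{d}{dt}\int_{g(t)}^{h(t)} F(x)\,dx
 &= \int_{a}^{b}\Bigl[F'(x)\,v + F(x)\,\frac{\partial v}{\partial x}\Bigr]\frac{\partial x}{\partial u}\,du \\
 &= \int_{a}^{b}\frac{\partial}{\partial x}\bigl[F(x)\,v\bigr]\,\frac{\partial x}{\partial u}\,du
  = \int_{g(t)}^{h(t)}\frac{\partial}{\partial x}\bigl[F(x)\,v\bigr]\,dx,
\end{aligned}
\]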

by the change of variable formula in reverse gear. Note that the time t is fixed in this 
process; the whole integration takes place instantaneously. 

We pause momentarily to inspect our progress. The derivative in question has
been expressed as an integral over the moving domain. The integrand depends on the
velocity $v$ at each point of the domain, but it just happens that the integrand is an
exact derivative, so the answer depends only on the boundary values. At the boundary
points $g(t)$ and $h(t)$, the velocities are $\dot g(t)$ and $\dot h(t)$ respectively, so finally

\[ \frac{d}{dt}\int_{g(t)}^{h(t)} F(x)\,dx = F[h(t)]\,\dot h(t) - F[g(t)]\,\dot g(t). \]
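
The claim that the answer does not depend on how the interior points move can be checked numerically. The sketch below is an editorial illustration: the integrand cos x, the endpoints g(t) = t and h(t) = t² + 1, and the two interior motions `straight` and `wriggling` are all hypothetical choices, with u running over [0, 1]. For each motion it integrates the time derivative of F[x(u,t)] ∂x/∂u over the fixed domain and compares the result with the boundary expression.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def F(x): return math.cos(x)
def g(t): return t
def g_dot(t): return 1.0
def h(t): return t * t + 1.0
def h_dot(t): return 2.0 * t

def straight(u, t):
    """Uniform stretching of the worm; returns (x, dx/du)."""
    return g(t) + u * (h(t) - g(t)), h(t) - g(t)

def wriggling(u, t):
    """Same endpoints, different interior motion; returns (x, dx/du)."""
    a = 0.5 * math.sin(t)
    s = u + a * u * (1.0 - u)
    ds_du = 1.0 + a * (1.0 - 2.0 * u)     # positive because |a| < 1
    return g(t) + (h(t) - g(t)) * s, (h(t) - g(t)) * ds_du

def rate_of_change(motion, t, eps=1e-5):
    """Integral over the fixed domain of d/dt { F[x(u,t)] dx/du }."""
    def G(u, s):
        x, x_u = motion(u, s)
        return F(x) * x_u
    return simpson(lambda u: (G(u, t + eps) - G(u, t - eps)) / (2 * eps), 0.0, 1.0)

t0 = 0.7
boundary_value = F(h(t0)) * h_dot(t0) - F(g(t0)) * g_dot(t0)
print(rate_of_change(straight, t0), rate_of_change(wriggling, t0), boundary_value)
```

Both motions should give the same number as the boundary expression, up to discretization error.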


This might seem a silly approach to the problem because (1) it introduces an 
unnecessary quantity v, and (2) it evades using the fundamental theorem initially, 
only to use it in the end after all. Yet there is an essential idea here, reduction to a 
fixed domain, and it wins the day when we generalize. 


3. A plane formula. Imagine a moving domain $D_t$ in the x, y-plane (Fig. 1).

FIG. 1

We are also given a function $F(x,y,t)$. The problem is to find

\[ \frac{d}{dt}\iint_{D_t} F(x,y,t)\,dx\,dy. \]


Already the ugly method of the last section is looking better, because it is not
immediately clear what replaces the two terms in (1.1) that resulted one way or the
other from use of the fundamental theorem. Actually, on second thought, the
fundamental theorem just may prove relevant, but in its two dimensional form, viz.,
Green's Theorem.

Certainly our first move should be separation of boundary variation from
integrand variation. This is easy enough by the chain rule device in the first section
and results in

\[
\left.\frac{d}{dt}\iint_{D_t} F(x,y,t)\,dx\,dy\,\right|_{t=t_0}
 = \left.\frac{d}{dt}\iint_{D_t} F(x,y,t_0)\,dx\,dy\,\right|_{t=t_0}
 + \iint_{D_{t_0}}\frac{\partial F}{\partial t}\,dx\,dy. \tag{3.1}
\]

This is routine. The essence of the problem is to find

\[ \frac{d}{dt}\iint_{D_t} F(x,y)\,dx\,dy. \]

This we shall do by a physicist's argument.


Look at two successive domains $D_t$ and $D_{t+dt}$. See Fig. 2.

FIG. 2


Let $\mathbf v = \mathbf v(x,y,t)$ denote the velocity vector at a boundary point $(x,y)$ of $D_t$ and let $\mathbf n$
denote the outward unit normal. In the difference

\[ \iint_{D_{t+dt}} F(x,y)\,dx\,dy - \iint_{D_t} F(x,y)\,dx\,dy, \]

everything in the overlap of $D_t$ and $D_{t+dt}$ cancels; only the thin boundary strip makes
a contribution. From the detail, this contribution is

\[ F(x,y)\,(\mathbf v\,dt)\cdot(\mathbf n\,ds) \]

up to higher order differentials, where $ds$ is the element of arc length. (Disclaimer:
I said it's a physicist's proof!) Hence

\[
\frac{1}{dt}\left[\iint_{D_{t+dt}} F(x,y)\,dx\,dy - \iint_{D_t} F(x,y)\,dx\,dy\right]
 \approx \int_{\partial D_t} F(x,y)\,\mathbf v\cdot\mathbf n\,ds,
\]


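where $\partial$ denotes boundary. Before taking limits, we compute $\mathbf v\cdot\mathbf n\,ds$. We rotate the
unit tangent $(dx/ds, dy/ds)$ backwards through a right angle to obtain $\mathbf n = (dy/ds, -dx/ds)$,
hence

\[ \mathbf v\cdot\mathbf n\,ds = (u,v)\cdot(dy,-dx) = u\,dy - v\,dx. \]

Therefore

\[ \frac{d}{dt}\iint_{D_t} F(x,y)\,dx\,dy = \int_{\partial D_t} F(x,y)\,(u\,dy - v\,dx). \tag{3.2} \]

We can transform the boundary integral into an integral over $D_t$ by Green's Theorem.
Let us do this and also combine (3.1) and (3.2) for the result of this section, a Leibniz
rule in the plane:

\[
\begin{aligned}
\frac{d}{dt}\iint_{D_t} F(x,y,t)\,dx\,dy
 &= \int_{\partial D_t} F\,(u\,dy - v\,dx) + \iint_{D_t}\frac{\partial F}{\partial t}\,dx\,dy \\
 &= \iint_{D_t}\Bigl[\operatorname{div}(F\mathbf v) + \frac{\partial F}{\partial t}\Bigr]\,dx\,dy.
\end{aligned} \tag{3.3}
\]

Here

\[
\operatorname{div}(F\mathbf v) = \frac{\partial}{\partial x}(Fu) + \frac{\partial}{\partial y}(Fv)
 = (\operatorname{grad} F)\cdot\mathbf v + F\operatorname{div}\mathbf v.
\]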
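
The first form of the plane Leibniz rule can be tested on a simple moving domain. The sketch below is an illustration under stated assumptions: the domain is a disc centred at the origin whose radius grows like 1 + t/2, so the boundary velocity is radial and v·n ds = (dR/dt) R dθ; the integrand F and all function names are hypothetical choices. The left-hand side is a central difference of a numerically computed double integral.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Illustrative integrand and expanding disc (assumptions, not from the text).
def F(x, y, t): return math.exp(-t * (x * x + y * y)) * (1.0 + x)
def dF_dt(x, y, t): return -(x * x + y * y) * F(x, y, t)
def radius(t): return 1.0 + 0.5 * t
def radius_dot(t): return 0.5

def disc_integral(func, t, R):
    """Integrate func(x, y, t) over the disc of radius R, in polar coordinates."""
    def inner(rho):
        ring = simpson(lambda th: func(rho * math.cos(th), rho * math.sin(th), t),
                       0.0, 2.0 * math.pi)
        return rho * ring
    return simpson(inner, 0.0, R)

def boundary_term(t):
    """Integral of F v.n ds over the circle, with v.n ds = radius_dot * R dtheta."""
    R = radius(t)
    return simpson(lambda th: F(R * math.cos(th), R * math.sin(th), t)
                   * radius_dot(t) * R, 0.0, 2.0 * math.pi)

def moving_integral(t):
    return disc_integral(F, t, radius(t))

t0, eps = 0.3, 1e-5
lhs = (moving_integral(t0 + eps) - moving_integral(t0 - eps)) / (2 * eps)
rhs = boundary_term(t0) + disc_integral(dF_dt, t0, radius(t0))
print(lhs, rhs)
```

The boundary term and the integrand-variation term together should reproduce the numerically differentiated integral.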


4. A space formula. Consider a fluid flowing through a region of space. The
Lagrange (historical) description of the flow gives the position $\mathbf x = \mathbf x(\mathbf u, t)$ at time $t$ of
the particle of fluid originally at point $\mathbf u$. The Euler description gives the velocity
$\mathbf v = \mathbf v(\mathbf x,t)$ at present time $t$ of the particle now at position $\mathbf x$. Suppose we are given a
domain $D_t$ that moves with the flow. Suppose also we are given a function $F(\mathbf x,t)$ on
the region of flow. The following formula, with a physicist's proof, can be found in
Prager [5], or Sokolnikoff and Redheffer [6].

\[
\begin{aligned}
\frac{d}{dt}\iiint_{D_t} F(\mathbf x,t)\,dx\,dy\,dz
 &= \iint_{\partial D_t} F\,\mathbf v\cdot d\boldsymbol\sigma + \iiint_{D_t}\frac{\partial F}{\partial t}\,dx\,dy\,dz \\
 &= \iiint_{D_t}\Bigl[\operatorname{div}(F\mathbf v) + \frac{\partial F}{\partial t}\Bigr]\,dx\,dy\,dz.
\end{aligned} \tag{4.1}
\]

Here $d\boldsymbol\sigma$ is the vectorial area element on the closed surface $\partial D_t$, so that

\[ d\boldsymbol\sigma = (dy\,dz,\ dz\,dx,\ dx\,dy) = \mathbf n\,d\sigma, \]

where $\mathbf n$ is the outward unit normal and $d\sigma$ is the element of area. We shall give a
mathematical proof of (4.1), without worrying much about minimal smoothness
conditions. Note that the two versions of the formula are equivalent by Gauss'
divergence theorem.

We shall use index notation for coordinates. The initial position is $\mathbf u = (u^1,u^2,u^3)$,
the moving point is $\mathbf x = (x^1,x^2,x^3)$, and the velocity is $\mathbf v = (v^1,v^2,v^3) = \dot{\mathbf x}
= (\dot x^1,\dot x^2,\dot x^3)$. Dot denotes $\partial/\partial t$.


We have a domain $C$ in u-space, and for each $t$ an imbedding $\phi_t : C \to D_t$ of $C$
into x-space. The mapping $(\mathbf u, t) \to \phi_t(\mathbf u)$ is assumed twice continuously differentiable,
and we write $\phi_t(\mathbf u) = \mathbf x(\mathbf u,t)$, the Lagrange description.

For fixed $t$, the Jacobian matrix of $\phi_t$ will be written

\[ \frac{\partial x}{\partial u} = \left[\frac{\partial x^i}{\partial u^j}\right]. \]

It is non-singular everywhere, and its inverse is $\partial u/\partial x = [\partial u^i/\partial x^j]$. Its determinant
$|\partial x/\partial u|$ is usually called the Jacobian of $\phi_t$.
We shall need a useful formula from determinant theory. If $A = A(t)$ is a
non-singular matrix function, then

\[ \frac{1}{|A|}\,\frac{d|A|}{dt} = \operatorname{trace}\bigl(\dot A A^{-1}\bigr). \tag{4.2} \]
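
Formula (4.2) can be checked on a concrete matrix function. The sketch below is for illustration only: the 3 × 3 matrix A(t) is an arbitrary choice, its derivative is entered by hand, and the inverse is computed from cofactors. The logarithmic derivative of the determinant is approximated by a central difference.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inverse3(m):
    """Inverse of a non-singular 3 x 3 matrix as adjugate / determinant."""
    d = det3(m)
    def cofactor(i, j):
        rows = [r for r in range(3) if r != i]
        cols = [c for c in range(3) if c != j]
        minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                 - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
        return (-1) ** (i + j) * minor
    # Entry (i, j) of the inverse is cofactor(j, i) / det.
    return [[cofactor(j, i) / d for j in range(3)] for i in range(3)]

# Illustrative matrix function and its entrywise time derivative.
def A(t):
    return [[1.0 + t, t * t, 0.0],
            [math.sin(t), 2.0, t],
            [0.0, math.cos(t), 3.0 + t]]

def A_dot(t):
    return [[1.0, 2.0 * t, 0.0],
            [math.cos(t), 0.0, 1.0],
            [0.0, -math.sin(t), 1.0]]

def trace_of_product(p, q):
    """trace(p q) for 3 x 3 matrices."""
    return sum(p[i][j] * q[j][i] for i in range(3) for j in range(3))

t0, eps = 0.3, 1e-6
log_derivative = (det3(A(t0 + eps)) - det3(A(t0 - eps))) / (2 * eps) / det3(A(t0))
trace_value = trace_of_product(A_dot(t0), inverse3(A(t0)))
print(log_derivative, trace_value)
```

The two printed values should coincide up to the error of the difference quotient.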


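We apply (4.2) to the Jacobian matrix. First we note that $(\partial x^i/\partial u^j)^{\cdot} = \partial\dot x^i/\partial u^j
= \partial v^i/\partial u^j$, hence

\[
\operatorname{trace}\Bigl\{\Bigl(\frac{\partial x}{\partial u}\Bigr)^{\!\cdot}\Bigl(\frac{\partial x}{\partial u}\Bigr)^{-1}\Bigr\}
 = \operatorname{trace}\Bigl(\Bigl[\frac{\partial v^i}{\partial u^j}\Bigr]\Bigl[\frac{\partial u^j}{\partial x^k}\Bigr]\Bigr)
 = \sum_{i,j}\frac{\partial v^i}{\partial u^j}\,\frac{\partial u^j}{\partial x^i}
 = \sum_{i}\frac{\partial v^i}{\partial x^i}
 = \operatorname{div}\mathbf v.
\]

The result is

\[ \frac{d}{dt}\left|\frac{\partial x}{\partial u}\right| = \left|\frac{\partial x}{\partial u}\right|(\operatorname{div}\mathbf v). \tag{4.3} \]

Now set

\[ f(t) = \iiint_{D_t} F(\mathbf x,t)\,dx^1\,dx^2\,dx^3. \]

By the change of variables rule,

\[ f(t) = \iiint_{C} F[\mathbf x(\mathbf u,t),t]\,\left|\frac{\partial x}{\partial u}\right|\,du^1\,du^2\,du^3. \]

Differentiation of this fixed domain integral is routine. We use (4.3) and then change
back to $D_t$ as soon as possible:

\[
\begin{aligned}
\dot f(t)
 &= \iiint_{C}\Bigl\{\Bigl[\sum_i\frac{\partial F}{\partial x^i}\,\dot x^i + \frac{\partial F}{\partial t}\Bigr]\left|\frac{\partial x}{\partial u}\right|
    + F[\mathbf x,t]\,\left|\frac{\partial x}{\partial u}\right|(\operatorname{div}\mathbf v)\Bigr\}\,du^1\,du^2\,du^3 \\
 &= \iiint_{D_t}\Bigl[(\operatorname{grad} F)\cdot\mathbf v + F\operatorname{div}\mathbf v + \frac{\partial F}{\partial t}\Bigr]\,dx^1\,dx^2\,dx^3.
\end{aligned}
\]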


But $(\operatorname{grad} F)\cdot\mathbf v + F\operatorname{div}\mathbf v = \operatorname{div}(F\mathbf v)$, so formula (4.1) follows. The proof is not
overwhelming once the ground has been paved.


5. Flux across a moving surface. Suppose in a region of x-space we have a piece
of surface $S_t$ that moves with time. We assume that $S_t$ is oriented, with vectorial area
element $d\boldsymbol\sigma$, and that $S_t$ is described by a map $(\mathbf u,t) \to \mathbf x(\mathbf u,t)$, where $\mathbf u = (u^1,u^2)$
varies over a domain $C$ in the u-plane. The surface might also be considered as
moving with a flow velocity $\mathbf v = \mathbf v(\mathbf x,t)$ as in Section 4. Suppose $\mathbf F(\mathbf x,t)$ is a time
dependent vector field in the region, and set

\[ f(t) = \iint_{S_t}\mathbf F\cdot d\boldsymbol\sigma, \]

so that $f(t)$ is the flux of the vector field $\mathbf F$ across the moving surface. The problem is
to find $\dot f(t)$. Now obviously this is fresh ground. First of all, the domain of integration
has smaller dimension than does the ambient space. Next, if we take the physicist's
point of view, and compare $S_t$ with $S_{t+dt}$ (as in Fig. 2), there won't be an overlap in
general, so we must expect a more complicated differentiation formula. In fact, the
formula is

\[
\frac{d}{dt}\iint_{S_t}\mathbf F\cdot d\boldsymbol\sigma
 = \iint_{S_t}(\operatorname{div}\mathbf F)\,\mathbf v\cdot d\boldsymbol\sigma
 - \int_{\partial S_t}(\mathbf v\times\mathbf F)\cdot d\mathbf x
 + \iint_{S_t}\dot{\mathbf F}\cdot d\boldsymbol\sigma. \tag{5.1}
\]
dt JJs, St aSe St 


We might have guessed the second and third terms on the right because of (3.3), but 
the first term could not have been predicted from the previous discussion. Formula 
(5.1), with a physicist’s proof, appears in Abraham and Becker [1]. The method used 
to prove (4.1) is inadequate for proving (5.1). It is interesting to try it (formally) 
because it leads to the wrong answer and provides a good lesson in the care that 
must be exercised with several variable transformations. 

Instead of proving (5.1), we shall pass on to its natural generalization, concerned 
with a moving r-domain in n-space. 


6. Interior product. More than half the job of proving a generalization of (5.1) 
is formulating the result in a tractable language. First we must drop the idea of 
integrating a function with respect to a measure. What we integrate is an exterior 
differential form over an oriented field of integration (oriented chain). (Particular care 
must be taken with orientation, because it is so easy to get incorrect signs.) As soon 
as we take this new point of view, we see that the result we are after has nothing to do 
with the euclidean structure of space. The result is meaningful for any coordinate 
space, more generally for a differentiable manifold with no additional structure 
whatever. For an exposition of the theory of differential forms and their integrals, 
see any modern book on differential geometry or advanced calculus, especially 
Flanders [2]. 

A reasonable formulation of (5.1) in higher space necessarily uses some notation
and some operations. One operation that is not widely known is the interior product,
whereby a vector field and a p-form contract to a (p − 1)-form.

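If $\mathbf v$ is a vector field and $\alpha$ is a one-form, we write the effect of $\alpha$ on $\mathbf v$ (the dual
pairing) as $\langle\mathbf v,\alpha\rangle$. Thus

\[ \Bigl\langle \sum v^i\frac{\partial}{\partial x^i},\ \sum a_j\,dx^j \Bigr\rangle = \sum v^i a_i. \]

The interior product of $\mathbf v$ and a decomposable p-form $\omega = \alpha^1\wedge\alpha^2\wedge\cdots\wedge\alpha^p$ is
defined by

\[
\mathbf v\,\lrcorner\,(\alpha^1\wedge\cdots\wedge\alpha^p)
 = \sum_{i=1}^{p}(-1)^{i-1}\langle\mathbf v,\alpha^i\rangle\,
   \alpha^1\wedge\cdots\wedge\alpha^{i-1}\wedge\alpha^{i+1}\wedge\cdots\wedge\alpha^p. \tag{6.1}
\]

By linearity, $\mathbf v\,\lrcorner\,\omega$ is extended to all p-forms $\omega$. To prove that (6.1) really defines an
operation that is independent of the representation of $\omega$ as a linear combination of
decomposable p-forms, it suffices to observe that the right-hand side of (6.1) is an
alternating multilinear function of $(\alpha^1,\ldots,\alpha^p)$.

Here are some examples. We set

\[ \mathbf v = u\frac{\partial}{\partial x} + v\frac{\partial}{\partial y} + w\frac{\partial}{\partial z}. \]

(To free ourselves of the euclidean "length and direction" concept of a vector, we
consider a vector as a directional differentiation.) Then

\[
\begin{aligned}
\mathbf v\,\lrcorner\,(F\,dx + G\,dy + H\,dz) &= uF + vG + wH, \\
\mathbf v\,\lrcorner\,(F\,dy\wedge dz + G\,dz\wedge dx + H\,dx\wedge dy)
 &= (wG - vH)\,dx + (uH - wF)\,dy + (vF - uG)\,dz, \\
\mathbf v\,\lrcorner\,(F\,dx\wedge dy\wedge dz) &= F\,(u\,dy\wedge dz + v\,dz\wedge dx + w\,dx\wedge dy).
\end{aligned} \tag{6.2}
\]

We may express these formulas in ordinary vector notation. Set $\mathbf F = (F,G,H)$. Then

\[
\begin{aligned}
\mathbf v\,\lrcorner\,(\mathbf F\cdot d\mathbf x) &= \mathbf v\cdot\mathbf F, \\
\mathbf v\,\lrcorner\,(\mathbf F\cdot d\boldsymbol\sigma) &= -(\mathbf v\times\mathbf F)\cdot d\mathbf x, \\
\mathbf v\,\lrcorner\,(F\,dx\wedge dy\wedge dz) &= F\,\mathbf v\cdot d\boldsymbol\sigma.
\end{aligned} \tag{6.3}
\]

We mention in passing two easily proved formulas:

\[
\mathbf v\,\lrcorner\,(\omega\wedge\eta) = (\mathbf v\,\lrcorner\,\omega)\wedge\eta + (-1)^{p}\,\omega\wedge(\mathbf v\,\lrcorner\,\eta), \qquad
\mathbf u\,\lrcorner\,(\mathbf v\,\lrcorner\,\omega) = -\,\mathbf v\,\lrcorner\,(\mathbf u\,\lrcorner\,\omega).
\]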
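
The interior product is simple to compute on coordinate forms. The sketch below is an editorial illustration with a hypothetical representation: a form with constant coefficients is stored as a dictionary from increasing index tuples to coefficients, with the indices 0, 1, 2 standing for x, y, z. It implements (6.1) on basis monomials and checks the second line of (6.3) and the anticommutation rule on sample numbers.

```python
def interior(v, form):
    """Interior product of the vector v with a form stored as
    {(i1 < i2 < ... < ip): coefficient}. Removing the k-th factor
    (k counted from 0) contributes the sign (-1)**k, as in (6.1)."""
    result = {}
    for idx, coeff in form.items():
        for k, i in enumerate(idx):
            rest = idx[:k] + idx[k + 1:]
            result[rest] = result.get(rest, 0) + (-1) ** k * v[i] * coeff
    return {key: c for key, c in result.items() if c != 0}

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def flux_form(F):
    """F.d(sigma) = F dy^dz + G dz^dx + H dx^dy, written on increasing index
    pairs; dz^dx = -dx^dz accounts for the minus sign."""
    return {(1, 2): F[0], (0, 2): -F[1], (0, 1): F[2]}

v = (1, 2, 3)
F = (4, 5, 6)
contracted = interior(v, flux_form(F))
expected = {(i,): -c for i, c in enumerate(cross(v, F))}
print(contracted, expected)

# Anticommutation u _| (v _| w) = - v _| (u _| w) on a 3-form.
u = (2, -1, 4)
volume = {(0, 1, 2): 5}
left = interior(u, interior(v, volume))
right = interior(v, interior(u, volume))
print(left, right)
```

Integer inputs keep the arithmetic exact, so the comparisons in this sketch are equalities rather than approximations.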


7. The general Leibniz rule. We are given a p-dimensional time-dependent
chain (field of integration) $D_t$ in n-space. We think of $D_t$ as given by a map

\[ \phi : (\mathbf u,t) \to \mathbf x(\mathbf u,t), \]

where $\mathbf u$ runs over a fixed domain $C$ in the p-dimensional u-space.
We also have an exterior p-form $\omega$ whose coefficients are time-dependent. In local
coordinates,

\[ \omega = \sum a_H(\mathbf x,t)\,dx^H, \qquad dx^H = dx^{h_1}\wedge\cdots\wedge dx^{h_p}, \tag{7.1} \]

where $1 \le h_1 < h_2 < \cdots < h_p \le n$. We seek the derivative of $\int_{D_t}\omega$. The answer is

\[
\frac{d}{dt}\int_{D_t}\omega = \int_{D_t}\mathbf v\,\lrcorner\,d_x\omega + \int_{\partial D_t}\mathbf v\,\lrcorner\,\omega + \int_{D_t}\dot\omega. \tag{7.2}
\]

Here $\dot\omega = \sum\dot a_H\,dx^H$ if $\omega$ is represented by (7.1). The exterior derivative $d_x\omega$ is taken
with respect to the space variables only. (Actually it would not matter if we included
the $dt$ term in $d\omega$ because $\mathbf v\,\lrcorner$ would wipe it out.) Precisely, $d\omega = d_x\omega + dt\wedge\dot\omega$ in
(x,t)-space. As before $\mathbf v = \dot{\mathbf x}$.

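Formula (7.2) has an attractive simplicity, and the presence of an exterior
derivative suggests that its proof involves Stokes's theorem. Such a proof is not hard in
itself, but requires careful preparation. We note that the other versions of the Leibniz
rule we have discussed are all special cases of (7.2). This statement follows readily
from (6.3).

Here is yet another special case. Let $C_t$ be a moving curve in 3-space, so
$\partial C_t = \{\mathbf x_1(t)\} - \{\mathbf x_0(t)\}$. The motion is described by a velocity vector

\[ \mathbf v = u\frac{\partial}{\partial x} + v\frac{\partial}{\partial y} + w\frac{\partial}{\partial z}, \]

and $\mathbf v[\mathbf x_0(t),t] = \dot{\mathbf x}_0$, $\mathbf v[\mathbf x_1(t),t] = \dot{\mathbf x}_1$. We want

\[ \frac{d}{dt}\int_{C_t}\omega, \qquad\text{where } \omega = \mathbf F\cdot d\mathbf x. \]

In the case of this line integral, $d_x\omega = (\operatorname{curl}\mathbf F)\cdot d\boldsymbol\sigma$, and by (6.3),

\[ \mathbf v\,\lrcorner\,\omega = \mathbf v\cdot\mathbf F, \qquad \mathbf v\,\lrcorner\,d_x\omega = -[\mathbf v\times(\operatorname{curl}\mathbf F)]\cdot d\mathbf x. \]

Therefore (7.2) specializes to

\[
\frac{d}{dt}\int_{C_t}\mathbf F\cdot d\mathbf x
 = -\int_{C_t}[\mathbf v\times(\operatorname{curl}\mathbf F)]\cdot d\mathbf x
 + \mathbf F[\mathbf x_1(t),t]\cdot\dot{\mathbf x}_1(t) - \mathbf F[\mathbf x_0(t),t]\cdot\dot{\mathbf x}_0(t)
 + \int_{C_t}\dot{\mathbf F}\cdot d\mathbf x. \tag{7.5}
\]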


8. Proof of (7.2). There is a technical advantage in taking the time variables
first: signs are simplified. Thus we have

\[ \phi : [a,b]\times C \to \mathbf R^n, \]

where $[a,b]$ is a closed interval on the t-axis, and $C$ is a p-chain in $\mathbf R^p$, the u-space.
We assume $\phi$ continuously differentiable, so it is actually defined on an open
neighborhood of $[a,b]\times C$. We shall use the boundary formula

\[
\partial([a,b]\times C) = (\partial[a,b])\times C - [a,b]\times\partial C
 = \{b\}\times C - \{a\}\times C - [a,b]\times\partial C. \tag{8.1}
\]

We must review the process of integrating an exterior p-form over an (oriented
differentiable singular) p-chain. Let $\alpha$ be a p-form in $\mathbf R^n$ and $\psi : C \to \mathbf R^n$ a p-chain
into the domain of $\alpha$. The defining formula for the integral is

\[ \int_{\psi}\alpha = \int_{C}\psi^{*}\alpha, \]

where $\psi^{*}\alpha$ is the p-form on $C$ induced by $\psi$, so $\psi^{*}\alpha = A(u)\,du^1\wedge\cdots\wedge du^p$. Then

\[ \int_{C}\psi^{*}\alpha = \int_{C} A(u)\,du^1\cdots du^p \]

is an ordinary (Riemann) integral, and it may be iterated in any order. For example
if $C$ is a rectangle,

\[
\iint_{C} A\,du^1\wedge du^2 = \iint_{C} A\,du^1\,du^2
 = -\iint_{C} A(u^1,u^2)\,du^2\wedge du^1
 = \int_{a}^{b} du^1\int_{c}^{d} A\,du^2 = \int_{c}^{d} du^2\int_{a}^{b} A\,du^1.
\]

In (7.2), the last term, $\int\dot\omega$, results from integrand variation only. As before, we
shall use the chain rule for this part of the formula, thereby reducing to the case
$\omega = \sum a_H(\mathbf x)\,dx^H$. This saves the introduction of additional spaces and mappings;
there will be quite enough as it is.

We write x = x(t,u) = f(t,u) and v =x = 0x/dt. We also introduce 


Pi C> R*, p,(u) = d(t,u). 


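Each $p$-form on $C$ may be considered as a $p$-form on $[a,b] \times C$ via the projection 
$(t,u) \to u$. In particular we shall consider $\phi_t^*\omega$ as a $p$-form on $[a,b] \times C$. We state 
two essential formulas: 

$$\begin{aligned}
\phi^*\omega &= \phi_t^*\omega + dt \wedge \phi_t^*(v \lrcorner\, \omega), \\
d(\phi^*\omega) &= dt \wedge \phi_t^*(v \lrcorner\, d\omega).
\end{aligned} \tag{8.2}$$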


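Their proof is based on the decomposition $\phi^*(dx) = \phi_t^*(dx) + v\, dt$ of $\phi^*(dx)$ into the 
part involving the space variables $du^j$ and part involving $dt$. Then, for example, 

$$\begin{aligned}
\phi^*(dx^1 \wedge \cdots \wedge dx^p) &= \phi^*(dx^1) \wedge \cdots \wedge \phi^*(dx^p) \\
&= (\phi_t^* dx^1 + v^1 dt) \wedge \cdots \wedge (\phi_t^* dx^p + v^p dt) \\
&= \phi_t^* dx^1 \wedge \cdots \wedge \phi_t^* dx^p + dt \wedge [v^1 \phi_t^* dx^2 \wedge \cdots \wedge \phi_t^* dx^p \\
&\qquad - \cdots + (-1)^{p-1} v^p \phi_t^* dx^1 \wedge \cdots \wedge \phi_t^* dx^{p-1}].
\end{aligned}$$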


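The first formula follows easily. Now apply it to $d\omega$, noting that $\phi_t^*(d\omega)$ 
is a $(p+1)$-form on $C$, hence 0, so that $\phi^*(d\omega) = dt \wedge \phi_t^*(v \lrcorner\, d\omega)$. But 
$d(\phi^*\omega) = \phi^*(d\omega)$. 

Now we use Stokes’s theorem: 

$$\int_{[a,t] \times C} d(\phi^*\omega) = \int_{\partial([a,t] \times C)} \phi^*\omega. \tag{8.3}$$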


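On the left, 

$$\int_{[a,t] \times C} d(\phi^*\omega) = \int_{[a,t] \times C} ds \wedge \phi_s^*(v \lrcorner\, d\omega) = \int_a^t ds \int_C \phi_s^*(v \lrcorner\, d\omega).$$

On the right, we have three terms according to (8.1). On the bases $\{a\} \times C$ and 
$\{t\} \times C$ of the cylinder $t$ is constant, so $dt \wedge (\;)$ in (8.2) makes no contribution. On 
the lateral side $[a,t] \times \partial C$ of the cylinder, the $p$-form $\phi_t^*\omega = 0$, because it is 0 on 
the $(p-1)$-chain $\partial C$. Therefore 

$$\begin{aligned}
\int_{\partial([a,t] \times C)} \phi^*\omega &= \int_{\{t\} \times C} \phi^*\omega - \int_{\{a\} \times C} \phi^*\omega - \int_{[a,t] \times \partial C} ds \wedge \phi_s^*(v \lrcorner\, \omega) \\
&= \int_C \phi_t^*\omega - \int_C \phi_a^*\omega - \int_a^t ds \int_{\partial C} \phi_s^*(v \lrcorner\, \omega).
\end{aligned}$$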
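Hence (8.3) implies 

$$\int_C \phi_t^*\omega - \int_C \phi_a^*\omega = \int_a^t ds \int_C \phi_s^*(v \lrcorner\, d\omega) + \int_a^t ds \int_{\partial C} \phi_s^*(v \lrcorner\, \omega),$$

that is, 

$$\int_{D_t} \omega - \int_{D_a} \omega = \int_a^t ds \int_{D_s} v \lrcorner\, d\omega + \int_a^t ds \int_{\partial D_s} v \lrcorner\, \omega. \tag{8.5}$$

We summon the fundamental theorem once again: 

$$\frac{d}{dt} \int_{D_t} \omega = \int_{D_t} v \lrcorner\, d\omega + \int_{\partial D_t} v \lrcorner\, \omega.$$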


This completes the proof and our story. 


1973] STANFORD COMPETITIVE MATH. EXAM. 627 


REMARK: Because the terms in (7.2) are additive in $D_t$, the formula is valid for 
the most general $p$-chain, a linear combination of coordinatized ones. 


References 

1. M. Abraham and R. Becker, Classical Theory of Electricity and Magnetism, 2nd ed., Blackie, London, 1950, pp. 39-40. 
2. H. Flanders, Differential Forms with Applications to the Physical Sciences, Academic Press, New York, 1963. 
3. W. Fleming, Functions of Several Variables, Addison-Wesley, Reading, Mass., 1965, pp. 197-200. 
4. L. H. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading, Mass., 1968, pp. 419 and 456. 
5. W. Prager, Introduction to Mechanics of Continua, Ginn, Boston, 1961, pp. 75-76. 
6. I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics and Engineering, 2nd ed., McGraw-Hill, New York, 1966, pp. 428-430. 


THE STANFORD UNIVERSITY COMPETITIVE 
EXAMINATION IN MATHEMATICS 


G. POLYA, Stanford University, and 
J. KILPATRICK, Teachers College, Columbia University 


1. Introduction. For twenty years, from 1946 to 1965, the Department of Mathematics 
at Stanford University conducted a competitive examination for high school 
seniors. The immediate and principal purpose of the examination was to identify, 
among each year’s high school graduates, singularly capable students and attract 
them to Stanford. The broader purpose was to stimulate interest in mathematics 
among high school students and teachers generally, as well as the public. 

Prof. Polya received his Univ. Budapest degree in 1912 and holds honorary degrees from the 
E. T. H. Zürich, Univ. Alberta, and Univ. Wisconsin. He taught at the E. T. H. until 1940 and has 
been at Stanford Univ. since. His numerous visiting posts include Cambridge, Oxford, Paris, 
Göttingen, and Princeton. He is a Correspondent of the Paris Academy of Sciences and holds honorary 
membership in the Council of the Soc. Math. de France, the London Math. Soc. and the Swiss Math. 
Soc. Prof. Polya received the M.A.A. Distinguished Service Award in 1963 and the 1968 N. Y. Film 
Festival top Blue Ribbon for “Let us teach guessing”. 

The scientific contributions of George Polya include over 230 research papers and the books 
Inequalities (with Hardy and Littlewood), How to Solve It, Isoperimetric Inequalities (with Szegő), 
Mathematics and Plausible Reasoning (2 v.), and Mathematical Discovery (2 v.). 

Prof. Polya’s personal influence on three generations of mathematicians has been enormous. 
Perhaps no book in existence has influenced the direction of thinking of young mathematicians 
more than his two volume masterpiece with G. Szegő, Aufgaben und Lehrsätze aus der Analysis. 

Jeremy Kilpatrick took the Stanford Examination himself while a senior in high school; later 
he assisted in grading the Stanford Examination in its last few years. While a graduate student he 
worked closely with Professor Polya, and he received his Stanford Ph.D. in Education under E. G. 
Begle. He has since been Assistant and Associate Professor at Teachers College, Columbia. He works 
in the heuristics of problem solving and in mathematical abilities, and he is the co-editor with 
I. Wirszup of the series “Soviet Studies in the Psychology of Learning and Teaching Mathematics”. 
Editor. 

628 G. POLYA AND J. KILPATRICK [June-July 

The examination was modeled on the Eötvös Competition [see 3], which was 
organized in Hungary in 1894, and which, in turn, appears to have been suggested by 
similar competitions in England and France. Gabor Szegő, chairman of the Stanford 
Department of Mathematics in 1946 and winner of the Eötvös Competition in 1912, 
initiated the Stanford examination. 

The examination was established in the belief that an early manifestation of 
mathematical ability is a definite indication of exceptional intelligence and suitability 
for intellectual leadership in any field of endeavor. Furthermore, mathematical ability 
can be tested at a comparatively early age because it is manifested “not so much by 
the amount of accumulated knowledge as by the originality of mind displayed in the 
game of grappling with difficult though elementary problems [2, p. 406].” 

As Buck [1] noted some years ago in reviewing mathematical competitions, an 
examination can be designed, broadly speaking, to test either achievement or aptitude. 
The Stanford University Competitive Examination in Mathematics was of the latter 
type. It emphasized 


originality and insight rather than routine competence .... A typical question might call for 
specific knowledge within the reach of those being tested, but would call for the employment 
of this in unusual ways requiring a high degree of ingenuity. The question may in fact intro- 
duce certain concepts which are quite unfamiliar to the student. In short, the winning student 
is asked to demonstrate research ability [1, pp. 204-205]. 


The first Stanford examination, in 1946, was administered in 60 California 
high schools to 322 participants. The winner was awarded a one-year scholarship by 
Stanford University; honorable mention and a mathematics book were given to three 
other participants. In 1953, the examination was extended beyond California to 
include Arizona, Oregon, and Washington; the number of scholarships was increased 
to two; and the number of honorable mention awards and books was increased to 
ten or so. From 1958 to 1962, the examination was co-sponsored by Sylvania Electric 
Products, Inc. The last examination, in 1965, was administered to about 1200 partici- 
pants in 151 centers in California, Arizona, Idaho, Montana, Nevada, Oregon, and 
Washington. Cash prizes of $500, $250, and $250 were awarded to the three winners; 
honorable mention and a mathematics book went to eighteen participants. The 
examination was discontinued after 1965 mainly because the Stanford Department 
of Mathematics turned its interest to more graduate teaching. 

Announcements of the examination were sent each year to all public and private 
high schools in each state where the examination was to be administered. Larger 
schools were designated as centers; students from other schools were free to arrange 
to take the examination in a convenient location. The examination was administered 
by teachers and school personnel on a Saturday afternoon in March or April. The 
participants were given three hours to attempt three to five problems. The following 
instructions were given: 


No books or notebooks may be used. You may not be able to do all the problems in three 
hours, but whatever you do should be carefully thought out. Scratch paper may be used. 
Either pen or pencil may be used. No questions concerning the test should be asked of the 
person in charge. 

Good presentation counts! 
It should be clear, concise, complete. 


The papers were read in a two-stage process: First, they were read by teams of 
graduate students in the Department of Mathematics, including, as was sometimes 
possible, graduate students who were experienced high school teachers. Each team 
of two students was assigned a problem to read in as many papers as they could 
handle. Papers containing either a stated minimum of good solutions (for example, 
one and a half or two out of four) or some special feature were forwarded to the 
second stage. In the second stage, each paper that survived the first screening was 
read by at least one faculty member of the Department. The papers considered most 
likely to be winners were read by all participating faculty members. 

To make the selection of winners easier, the problems were devised so that only a 
very few participants would be able to solve all of them. On the other hand, to avoid 
too much frustration, the first problem was usually more accessible than the others, 
especially in the later years, so that many participants were able to solve it. 

Although the mathematical content of the problems did not go beyond that of 
the high school curriculum, the problems were of types seldom found in textbooks. 
The purpose of such problems was not only to test the students’ originality, but also 
to enrich the high school mathematics program by suggesting some new directions 
for students’ and teachers’ work. The types of problems included (1) “guess and 
prove,” in which one first guesses and then proves a mathematical fact, (2) “test 
consequences,” in which one tests the consequences of a general statement, (3) 
“you may guess wrong,” in which a highly plausible guess is incorrect, (4) “small scale 
theory,” in which a sequence of subproblems illustrates theory construction, and (5) 
“red herring,” in which an obvious relationship among the data turns out to be 
irrelevant to the solution [see 6, pp. 160-161, ex. 1; 8, p. 139, ex. 14.23]. 

The problems were of the sort used as illustrations in How to Solve It [4], the 
two volumes of Mathematics and Plausible Reasoning [5, 6], and the two volumes 
of Mathematical Discovery [7,8]. In fact, many of the problems appear, usually with 
solutions, in these books. The interested reader is directed, in particular, to the 
following sources: 

1. Part IV of [4] contains problems taken (with a few minor changes) from the 
1946-56 examinations. Hints and solutions are provided. 




2. Chapters I and VII of [5] may be useful in attacking problems involving induction. 
Several of the problems from the 1946-50 examinations are included among 
the examples and comments at the end of the chapters. 

3. The examples and comments on Chapter XVI of [6] contain some problems 
from the 1946-52 examinations. The appendix added in the second edition contains 
additional problems, one of which is taken from the 1958 and one from the 1964 
examination. 

4. Chapters 2 and 6 of [7] discuss and elaborate Descartes’ method for solving 
problems. Some of the examples used to illustrate the method, and several of the 
problems at the end of the chapters, are taken from the 1951-61 examinations. The 
appendix of [7] gives some suggestions for teachers on how to use such problems in 
class. 

5. Chapter 15 of [8] illustrates the use of research problems — including one 
from the 1965 examination — to provide students something of an opportunity for 
independent creative work. Additional problems from the 1961-65 examinations 
are given at the end of the chapter and in the appendix to the corrected printing. 

Each year the examination was conducted (with six exceptions), an article listing 
the problems and the winners was published either in this MONTHLY or in the Cali- 
fornia Mathematics Council Bulletin. Some hints and solutions also were published 
in the latter journal. The problems have never before, however, been collected together 
as a set. 

Section 2 contains the complete set of problems from the Stanford University 
Competitive Examination in Mathematics. They are numbered as follows: “46.1” 
refers to problem 1 in the 1946 examination. Owing to space limitations, hints and 
solutions are not provided. A booklet containing the problems and a complete set 
of hints and solutions, many developed in seminars on problem solving at Stan- 
ford University and at Teachers College, together with specific references to articles 
and books in which the problems have previously appeared, will be published 
under the title The Stanford Mathematics Problem Book by the Teachers College 
Press, Teachers College, Columbia University, New York, NY 10027. 


2. Problems 


46.1. In a tennis tournament there are 2n participants. In the first round of the tournament each 
participant plays just once, so there are n games, each occupying a pair of players. Show that the 
pairing for the first round can be arranged in exactly 


$$1 \times 3 \times 5 \times 7 \times 9 \times \cdots \times (2n - 1)$$


different ways. 
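For small tournaments the count stated in 46.1 can be checked by direct enumeration. The sketch below is an illustration only, not a proof and not part of the original problem set; the function names `count_pairings` and `odd_product` are hypothetical.

```python
from math import prod


def count_pairings(players):
    """Count the ways to split an even-sized list of players into unordered pairs."""
    if not players:
        return 1
    rest = players[1:]
    total = 0
    for i in range(len(rest)):
        # Pair the first player with rest[i], then pair up everyone who remains.
        total += count_pairings(rest[:i] + rest[i + 1:])
    return total


def odd_product(n):
    """The product 1 x 3 x 5 x ... x (2n - 1) of the first n odd numbers."""
    return prod(range(1, 2 * n, 2))
```

Comparing `count_pairings(list(range(2 * n)))` with `odd_product(n)` for a few small values of n gives an empirical check of the formula; the proof the problem asks for is a separate matter.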


46.2. In a tetrahedron (which is not necessarily regular) two opposite edges have the same length 
a and they are perpendicular to each other. Moreover they are each perpendicular to a line of length 
b which joins their midpoints. Express the volume of the tetrahedron in terms of a and b, and prove 
your answer. 




46.3. Consider the following four propositions, which are not necessarily true. 

I. If a polygon inscribed in a circle is equilateral it is also equiangular. 

II. If a polygon inscribed in a circle is equiangular it is also equilateral. 

III. If a polygon circumscribed about a circle is equilateral it is also equiangular. 

IV. If a polygon circumscribed about a circle is equiangular it is also equilateral. 

(A) State which of the four propositions are true and which are false, giving a proof of your 
statement in each case. 

(B) If, instead of general polygons, we should consider only quadrilaterals which of the four 
propositions are true and which are false? And if we consider only pentagons? 

In answering (B) you may state conjectures, but prove as much as you can and separate clearly 
what is proved and what is not. 


47.1. To number the pages of a bulky volume the printer used 1890 digits. How many pages 
has the volume? 


47.2. Among grandfather’s papers a bill was found: 

72 turkeys $_67.9_ 

The first and last digit of the number that obviously represented the total price of those fowls 
are replaced here by blanks, for they have faded and are now illegible. 
What are the two faded digits and what was the price of one turkey? 


47.3. Determine $m$ so that the equation in $x$ 

$$x^4 - (3m + 2)x^2 + m^2 = 0$$

has four real roots in arithmetic progression. 
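47.4. Let $\alpha$, $\beta$ and $\gamma$ denote the angles of a triangle. Show that 

$$\sin\alpha + \sin\beta + \sin\gamma = 4 \cos\frac{\alpha}{2} \cos\frac{\beta}{2} \cos\frac{\gamma}{2},$$

$$\sin 2\alpha + \sin 2\beta + \sin 2\gamma = 4 \sin\alpha \sin\beta \sin\gamma,$$

and 

$$\sin 4\alpha + \sin 4\beta + \sin 4\gamma = -4 \sin 2\alpha \sin 2\beta \sin 2\gamma.$$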
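Before attempting a proof, the three identities of 47.4 can be spot-checked numerically. The following is a minimal sketch under the assumption that two angles are supplied and the third is computed from the angle sum of a triangle; the function name `identity_residuals` is hypothetical and the check is no substitute for the proof.

```python
import math


def identity_residuals(alpha, beta):
    """Return (left side minus right side) for the three identities of 47.4."""
    gamma = math.pi - alpha - beta  # the angles of a triangle sum to pi
    r1 = (math.sin(alpha) + math.sin(beta) + math.sin(gamma)
          - 4 * math.cos(alpha / 2) * math.cos(beta / 2) * math.cos(gamma / 2))
    r2 = (math.sin(2 * alpha) + math.sin(2 * beta) + math.sin(2 * gamma)
          - 4 * math.sin(alpha) * math.sin(beta) * math.sin(gamma))
    r3 = (math.sin(4 * alpha) + math.sin(4 * beta) + math.sin(4 * gamma)
          + 4 * math.sin(2 * alpha) * math.sin(2 * beta) * math.sin(2 * gamma))
    return r1, r2, r3
```

For any admissible pair of angles (in radians) all three residuals should vanish up to rounding error.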


48.1. Consider the table: 

1 = 0 + 1 
2 + 3 + 4 = 1 + 8 
5 + 6 + 7 + 8 + 9 = 8 + 27 
10 + 11 + 12 + 13 + 14 + 15 + 16 = 27 + 64 


Guess the general law suggested by these examples, express it in suitable mathematical notation, and 
prove it. 


48.2. Three numbers are in arithmetic progression, three other numbers in geometric progression. 
Adding the corresponding terms of these two progressions successively, we obtain 

85, 76, and 84 

respectively, and adding all three terms of the arithmetic progression, we obtain 126. Find the terms 
of both progressions. 


48.3. From the peak of a mountain, you see two points, A and B, in the plain. The lines of vision, 
directed to these points, include the angle $\gamma$. The inclination of the first line of vision to a horizontal 
plane is $\alpha$, that of the second line $\beta$. It is known that the points A and B are on the same level and that 
the distance between them is c. 

Express the elevation x of the peak above the common level of A and B in terms of the angles 
$\alpha$, $\beta$, $\gamma$ and the distance c. 


48.4. A first sphere has the radius $r_1$. About this sphere circumscribe a regular tetrahedron. 
About this tetrahedron circumscribe a second sphere with radius $r_2$. About this second sphere 
circumscribe a cube. About this cube circumscribe a third sphere with radius $r_3$. 

Find the ratios $r_1 : r_2 : r_3$ (which should be, according to Kepler, the ratios of the mean distances 
of the planets Mars, Jupiter, and Saturn from the Sun, but which are, in fact, rather different from 
the true ratios). 


49.1. Prove that no number in the sequence 
11, 111, 1111, 11111, ... 


is the square of an integer. 
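The claim of 49.1 can be tested empirically for the first several terms of the sequence. The sketch below is illustrative only and proves nothing about the terms beyond the tested range; the helper names `repunit` and `is_square` are hypothetical.

```python
from math import isqrt


def repunit(k):
    """The integer written with k digits, all equal to 1."""
    return int("1" * k)


def is_square(m):
    """True if the non-negative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m
```

Checking `is_square(repunit(k))` for k = 2, 3, 4, ... should return `False` every time, in agreement with the statement to be proved.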


49.2. The three sides of a triangle are of lengths $l$, $m$, and $n$, respectively. The numbers $l$, $m$, and 
$n$ are positive integers, 

$$l \le m \le n.$$

(A) Take $n = 9$ and find the number of different triangles of the described kind. 
(B) Take various values of $n$ and find a general law. 


49.3. (A) Prove the following theorem: A point lies inside an equilateral triangle and has the dis- 
tances x, y, and z from the three sides respectively; h is the altitude of the triangle. Then x +y+z=h. 


(B) State precisely and prove the analogous theorem in solid geometry concerning the distances 
of an inner point from the four faces of a regular tetrahedron. 

(C) Generalize both theorems so that they should apply to any point in the plane or space, res- 
pectively (and not only to points inside the triangle or tetrahedron). Give precise statements and, if 
you have time, also proofs. 


50.1. Observe that 

1 = 1 
1 − 4 = −(1 + 2) 
1 − 4 + 9 = 1 + 2 + 3 
1 − 4 + 9 − 16 = −(1 + 2 + 3 + 4) 


Guess the general law suggested by these examples, express it in suitable mathematical notation, 
and prove it. 


50.2. Given a square. Find the locus of the points from which the square is seen under an angle 
(A) of 90° (B) of 45°. (Let P be a point outside the square, but in the same plane. The smallest angle 
with vertex P containing the square is the “angle under which the square is seen” from P.) Sketch 
clearly both loci, give a full description, and a proof. 


50.3. Call “axis” of a solid a straight line joining two points of the surface of the solid and such 
that the solid, rotated about this line through an angle which is greater than 0° and less than 360° 
coincides with itself. 

A cube has 13 different axes, which are of three different kinds. Describe clearly the location of 
these axes, find the angle of rotation associated with each. Assuming that the edge of the cube is of 
unit length, compute the arithmetic mean of the lengths of the 13 axes. Do not use tables and com- 
pute to two decimals. 


51.1. The length of the perimeter of a right triangle is 60 inches and the length of the altitude 
perpendicular to the hypotenuse is 12 inches. Find the sides of the triangle. 


51.2. A quadrilateral is cut into four triangles by its two diagonals. We call two of these triangles 
“opposite” if they have a common vertex but no common side. Prove the following statements: (A) 
The product of the areas of two opposite triangles is equal to the product of the areas of the other two 
opposite triangles. (B) The quadrilateral is a trapezoid if, and only if, there are two opposite triangles 
equal in area. (C) The quadrilateral is a parallelogram if, and only if, all four triangles are equal in 
area. 


51.3. We consider the frustum of a right circular cone. The plane that is parallel to the lower and 
upper bases of the frustum and at equal distance from both intersects the frustum in the “median 
circle.” The frustum and a cylinder have the same altitude, and the median circle of the frustum is 
the base of the cylinder. Which one of these two solids has the greater volume, the frustum or the 
cylinder? Prove your answer! (A possible proof is by algebra: Express both volumes in terms of suita- 
ble data and transform their difference so that its sign becomes obvious.) 


52.1. Prove the proposition: If a side of a triangle is less than the average (arithmetic mean) of 
the two other sides, the opposite angle is less than the average of the two other angles. 


52.2. Consider the frustum of a right pyramid with square base. Call “midsection” the 
intersection of the frustum with a plane parallel to the base and the top and at the same distance from both. 
Call “intermediate rectangle” the rectangle of which one side is equal to a side of the base and the 
other side is equal to a side of the top. 

Four different friends of yours agree that the volume of the frustum equals the altitude multi- 
plied by a certain area, but they disagree and make four different proposals regarding this area: 

I. the midsection, 

II. the average of the base and the top, 

III. the average of the base, the top, and the midsection, 

IV. the average of the base, the top, and the intermediate rectangle, 


Let h be the altitude of the frustum, a the side of its base, and b the side of its top. Express each 
of the four proposed rules in mathematical notation, decide whether it is right or wrong, and prove 
your answer. 


52.3. Prove that the only solution of the equation 

$$x^2 + y^2 + z^2 = 2xyz$$

in integers $x$, $y$, and $z$ is $x = y = z = 0$. 


53.1. Bob has 10 pockets and 44 silver dollars. He wants to put his dollars into his pockets so 
distributed that each pocket contains a different number of dollars. 
(A) Can he do so? 
(B) Generalize the problem, considering $p$ pockets and $n$ dollars. The problem is the most 
interesting when 

$$n = \frac{(p + 1)(p - 2)}{2}.$$

Why? 
53.2. Observe that the value of 

$$\frac{1}{2!} + \frac{2}{3!} + \frac{3}{4!} + \cdots + \frac{n}{(n+1)!}$$

is 1/2, 5/6, 23/24 for n = 1, 2, 3, respectively. Guess the general law (by observing more values if 
necessary) and prove your guess. 
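The advice in 53.2 to observe more values can be followed with exact rational arithmetic. The following sketch only tabulates the sum; it does not state or prove the general law, and the function name `partial_sum` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def partial_sum(n):
    """Exact value of 1/2! + 2/3! + ... + n/(n+1)! as a fraction."""
    return sum(Fraction(k, factorial(k + 1)) for k in range(1, n + 1))
```

For example, `[partial_sum(n) for n in range(1, 8)]` lists the first seven values as fractions in lowest terms, which is usually enough to suggest a conjecture.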


53.3. Find x, y, u, and v satisfying the system of four equations 

x + 7y + 3v + 5u = 16 
8x + 4y + 6v + 2u = −16 
2x + 6y + 4v + 8u = 16 
5x + 3y + 7v + u = −16 


(This may look long and boring: look for a shortcut.) 


53.4. The four points G, H, V, and U are (in this order) the four corners of a quadrilateral. 
A surveyor wants to find the length $UV = x$. He knows the length $GH = l$ and measures the 
four angles 

$$\angle GUH = \alpha, \quad \angle HUV = \beta, \quad \angle UVG = \gamma, \quad \angle GVH = \delta.$$

(A) Express $x$ in terms of $\alpha$, $\beta$, $\gamma$, $\delta$, and $l$. 
(B) Find some way to test the correctness of the result. 
(C) If you had a clear plan to do (A), characterize it in one short sentence. 


54.1. Consider the table 


1 = 1 

3 +5 = 8 
7+9+ 11 = 27 

13 +15+17+ 19 = 64 


21 + 23 + 25 + 27+ 29= 125 


Guess the general law suggested by these examples, express it in suitable mathematical notation, and 
prove it. 


54.2. The side of a regular hexagon is of length n (n is an integer). By equidistant parallels to its 
sides, the hexagon is divided into T equilateral triangles each of which has sides of length 1. Let V 
denote the number of vertices appearing in this division, and L the number of boundary lines of 
length 1. (A boundary line belongs to one or two triangles, a vertex to two or more triangles.) When 
n = 1, which is the simplest case, T = 6, V = 7, L = 12. 
Consider the general case and express T, V, and L in terms of n. (Guessing is good, proving is better.) 
54.3. Show that it is impossible to find (real or complex) numbers a, b, c, A, B, and C such that 
the equation 


$$x^2 + y^2 + z^2 = (ax + by + cz)(Ax + By + Cz)$$


holds identically for independently variable x, y, and z. 


55.1. Bob wants a piece of land, exactly level, which has four boundary lines. Two boundary 
lines run exactly north-south, the two others exactly east-west, and each boundary line measures 
exactly 100 feet. Can Bob buy such a piece of land in the U.S.? State your reasons! 


55.2. (A) Find three numbers $p$, $q$, and $r$ so that the equation 

$$x^4 + 4x^3 - 2x^2 - 12x + 9 = (px^2 + qx + r)^2$$

holds identically for variable $x$. 
(B) This problem requires the “exact” extraction of a square root of a given polynomial of degree 
4, which may be possible in the present case, yet usually it is not. Why not? 


55.3. Bob, Peter, and Paul travel together. Peter and Paul are good hikers; each walks p miles 
per hour. Bob has a bad foot and drives a small car in which two people can ride, but not three; 
the car covers c miles per hour. The three friends adopted the following scheme: They start together, 
Paul rides in the car with Bob, Peter walks. After a while, Bob drops Paul who walks on; Bob returns 
to pick up Peter, and then Bob and Peter ride in the car till they overtake Paul. At this point, they 
change: Paul rides and Peter walks just as they started and the whole procedure is repeated as often 
as necessary. 

(A) How much progress (how many miles) does the company make per hour? 

(B) Through which fraction of the travel time does the car carry just one man? 

(C) Check the extreme cases p= 0 and p= c. 

55.4. The vertex of a pyramid opposite the base is called the apex. (A) Let us call a pyramid 
“isosceles” if its apex is at the same distance from all vertices of the base. Adopting this definition, prove 
that the base of an isosceles pyramid is inscribed in a circle the center of which is the foot of the 
pyramid’s altitude. 

(B) Now let us call a pyramid “isosceles” if its apex is at the same (perpendicular) distance from 
all sides of the base. Adopting this definition (different from the foregoing) prove that the base of 
an isosceles pyramid is circumscribed about a circle the center of which is the foot of the pyramid’s 
altitude. 

56.1. Given a regular hexagon and a point in its plane. Draw a straight line through the given 
point that divides the given hexagon into two parts of equal area. 

56.2. I say that you can pay 50 cents in exactly 50 different manners. (The “manner” depends on 
how many coins of each kind — cents, nickels, dimes, quarters, half dollars — you use.) In how 
many manners can you pay 25 cents? Am I right about 50 cents? Justify your answer as clearly as 
you can. 


56.3. Construct a hexagon by adding to an arbitrarily given triangle △ three exterior isosceles 
triangles each of which has an angle of 120° opposite to that side of △ that forms its base. Show that 
those three vertices of the hexagon that are not vertices of the given △ are the vertices of an equilateral 
triangle. (It is enough to express just one side s of the allegedly equilateral triangle in terms of the 
sides a, b, and c of △, provided that this expression for s is symmetric in a, b, and c.) 

56.4. Ten people are sitting around a round table. The sum of ten dollars is to be distributed 
among them according to the rule that each person receives one half of the sum that his two neighbors 
receive jointly. Is there just one way to distribute the money? Prove your answer. 

57.1. Bob’s stamp collection consists of three books. Two tenths of his stamps are in the first 
book, several sevenths in the second book, and there are 303 stamps in the third book. How many 
stamps has Bob? (Is the condition sufficient to determine the unknown ?) 

57.2. We call a vertex of a tetrahedron trirectangular if the three edges starting from it are per- 
pendicular to each other. Given the areas A, B, and C of the three faces adjacent to the trirectangular 
vertex of a tetrahedron, find the area D of the fourth face, opposite to that vertex. (Which problem 
of plane geometry would you regard as analogous?) 

57.3. Divide a given triangle by three straight cuts into seven pieces four of which are triangles 
(and the remaining three pentagons). One of the triangular pieces is included by the three cuts, each 
of the three other triangular pieces is included by a certain side of the given triangle and two cuts. 

(A) Choose the three cuts so that the four triangular pieces turn out to be congruent. Describe 
your choice precisely and draw a clear figure. 
(B) Which fraction of the area of the given triangle is the area of a triangular piece in the 
dissection that you chose? 
(It may be advantageous to examine first a particular shape of the given triangle for which the 
solution is particularly easy.) 


58.1. How old is the captain, how many children has he, and how long is his boat? Given the 
product 32118 of the three desired numbers (integers). The length of the boat is given in feet (is several 
feet), the captain has both sons and daughters, he has more years than children, but he is not yet one 
hundred years old. (Give reasons for your answer.) 


58.2. Find x, y, u, and v satisfying the system of four equations: 

x + y + u = 4 
y + u + v = −5 
u + v + x = 0 
v + x + y = −8 


(This may look long and boring: look for a shortcut.) 


58.3. “In any triangle the sum of the three ... is greater than the semiperimeter.” 
Replace the dots ... successively by 

I. altitudes 
II. medians 
III. bisectors (of the angles). 

You obtain so three different assertions. Examine each assertion: is it true or false? Prove your 
answer! 


58.4. Observe that the value of 

$$1 \cdot 1! + 2 \cdot 2! + 3 \cdot 3! + \cdots + n \cdot n!$$

is 1, 5, 23, 119 for n = 1, 2, 3, 4, respectively. Guess the general law (by observing more values if 
necessary) and prove your guess. 
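More values of the sum in 58.4 are easy to generate by machine. This is a minimal sketch for tabulation only; it neither states nor proves the general law, and the function name `weighted_factorial_sum` is hypothetical.

```python
from math import factorial


def weighted_factorial_sum(n):
    """Value of 1*1! + 2*2! + ... + n*n!."""
    return sum(k * factorial(k) for k in range(1, n + 1))
```

Printing `weighted_factorial_sum(n)` alongside `factorial(n + 1)` for n = 1, 2, 3, ... is one way to look for a pattern.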


59.1. Al and Bill live at opposite ends of the same street. Al had to deliver a parcel at Bill’s home, 
Bill one at Al’s home. They started at the same moment, each walked at constant speed and returned 
home immediately after leaving the parcel at its destination. They met the first time at the distance of 
a yards from Al’s home and the second time at the distance of b yards from Bill’s home. 

(A) How long is the street? 

(B) If a = 300 and b = 400 who walks faster? 

59.2. Pennies (equal circles) are arranged in a regular pattern all over a very-very large table 
(the infinite plane). We examine two patterns. 

In the first pattern, each penny touches four other pennies and the straight lines joining the centers 
of the pennies in contact dissect the plane into equal squares. 

In the second pattern, each penny touches six other pennies and the straight lines joining the 
centers of the pennies in contact dissect the plane into equal equilateral triangles. 

Compute the percentage of the plane covered by pennies (circles) for each pattern. 


59.3. Prove: If $n$ is an integer greater than 1, $n^{n-1} - 1$ is divisible by $(n - 1)^2$. 
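The divisibility statement of 59.3 lends itself to a quick empirical check with exact integer arithmetic. The sketch below tests individual values of n and is no replacement for the proof; the function name `divisible` is hypothetical.

```python
def divisible(n):
    """True if n**(n - 1) - 1 is divisible by (n - 1)**2, for an integer n > 1."""
    return (n ** (n - 1) - 1) % ((n - 1) ** 2) == 0
```

Evaluating `all(divisible(n) for n in range(2, 200))` is one way to confirm the claim over a modest range before looking for the reason behind it.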


59.4. Erect an (exterior) square on each side of an (arbitrarily given) triangle. Those 6 vertices 
of these 3 squares that do not coincide with a vertex of the triangle form a hexagon. Three sides of 
this hexagon are, of course, equal to the corresponding sides of the triangle. Show that each one of 
the remaining three sides equals the double of a median of the triangle. 


60.1. A certain make of ball point pen was priced 50 cents in the store opposite the high school 
but found few buyers. When, however, the store had reduced the price, the whole remaining stock 
was sold for $31.93. What was the reduced price? (Is the condition sufficient to determine the un- 
known?) 


60.2. The point P is so located in the interior of a rectangle that the distance of P from a corner 
of the rectangle is 5 yards, from the opposite corner 14 yards, and from a third corner 10 yards. What 
is the distance of P from the fourth corner? 


60.3. Prove the identity 


$\cos\frac{a}{2}\,\cos\frac{a}{4}\,\cos\frac{a}{8} = \frac{\sin a}{8 \sin(a/8)}$


and generalize. 


60.4. Of twelve congruent equilateral triangles eight are the faces of a regular octahedron and 
four the faces of a regular tetrahedron. Find the ratio of the volume of the octahedron to the volume 
of the tetrahedron. 


61.1. Solve the following system of three equations for the unknowns x, y, and z: 
5732x + 2134y + 2134z= 7866, 
2134x + 5732y + 2134z= 670, 
2134x + 2134y + 5732z= 11464. 


61.2. It was a very hot day and the 4 couples drank together 44 bottles of coca-cola. Ann had 2, 


638 G. POLYA AND J. KILPATRICK [June-July 


Betty 3, Carol 4 and Dorothy 5 bottles. Mr. Brown drank just as many bottles as his wife, but each 
of the other men drank more than his wife: Mr. Green twice, Mr. White three times and Mr. Smith 
four times as many bottles. Tell the last names of the four ladies. (Prove your answer.) 


61.3. Solve the following system of three equations for the unknowns x, y and z (a, b and c are 
given): 


$x^2y^2 + x^2z^2 = axyz,$
$y^2z^2 + y^2x^2 = bxyz,$
$z^2x^2 + z^2y^2 = cxyz.$


61.4. A pyramid is called ‘‘regular”’ if its base is a regular polygon and the foot of its altitude is 
the center of its base. A regular pyramid has a hexagonal base the area of which is one quarter of the 
total surface-area S of the pyramid. The altitude of the pyramid is A. Express S in terms of A. 


62.1. Solve the system 
$2x^2 - 4xy + 3y^2 = 36,$
$3x^2 - 4xy + 2y^2 = 36.$


(One solution is easy to guess, but you are required to find all solutions. Knowledge of analytic 
geometry is not needed to solve this problem, but may help to understand the result — how?) 

62.2. Each of the four numbers a, b, c, and d is positive and less than one. Show that not all
four products


$4a(1 - b)$, $4b(1 - c)$, $4c(1 - d)$, $4d(1 - a)$


are greater than one. 


62.3. On each side of a right triangle, erect an exterior square (as it is usually done to illustrate 
Pythagoras’ theorem). Join the vertex of the triangle’s right angle to the center of the square on the 
hypotenuse, and join the centers of the squares on the other two sides. Show that the two line seg- 
ments so obtained are 

(A) perpendicular to each other and 

(B) of equal length. 


62.4. Five edges of a tetrahedron are of the same length a, and the sixth edge is of the length b.

(A) Express the radius of the sphere circumscribed about the tetrahedron in terms of a and b.

(B) How would you use the result (A) to determine practically the radius of a spherical surface 
(of a lens)? 


63.1. In a right triangle, c is the length of the hypotenuse, a and b are the lengths of the two
other sides, and d is the length of the diameter of the inscribed circle. Prove that


$a + b = c + d.$


63.2. Show that the expression
$n^2 (n^2 - 1)(n^2 - 4)$


is divisible by 360 for n = 1, 2, 3, ....




63.3. Solve the system of three equations for the unknowns x, y, and z, giving all solutions: 
$x^2 + 5y^2 + 6z^2 + 8(yz + zx + xy) = 36,$
$6x^2 + y^2 + 5z^2 + 8(yz + zx + xy) = 36,$
$5x^2 + 6y^2 + z^2 + 8(yz + zx + xy) = 36.$


(One solution is easy to find.) 


63.4. The base of a right prism is a regular hexagon, and the height of the prism is equal to the 
diameter of the circle inscribed in the base. The volume of the prism is equal to the volume of a 
regular octahedron. Find the ratio of the surface-areas of these two solids. 

Observe that the two solids have the same number of faces, and one of them is a regular solid, 
but the other is not. Any remark? 


64.1. A cake has the shape of a right prism with a square base; it has icing on the top as well as 
on the sides (that is, on the four lateral faces). The altitude of the prism is 5/16 of the side of its base. 
Cut the cake into 9 pieces so that each piece has the same amount of cake and the same amount of 
icing. One of the 9 pieces should be a right prism with a square base with icing only on the top: Com- 
pute the ratio of its altitude to a side of its base and give a clear description, with an acceptable 
sketch, of all 9 pieces. 


64.2. Show that each number of the sequence 
49, 4489, 444889, 44448889, ...


is a perfect square. 


64.3. If the area of a triangle is rational (that is, measured by a rational number) there are four 
thinkable cases: The triangle may have three or two rational sides, or just one or no rational side. 
Show by (preferably simple) examples that all four cases are actually possible. 


64.4. An examination in three subjects, Algebra, Biology, and Chemistry, was taken by 41 students.
The following table shows how many students failed in each single subject and in their various
combinations:


in A B C AB AC BC ABC 
failed 12 5 8 2 6 3 1 


(For instance, 5 students failed in Biology, among whom there were 3 failing both in Biology and in
Chemistry, and just one of these 3 failed in all three subjects.) How many students passed in all
three subjects?


(Can you think of a suitable diagram that would clarify the underlying idea?)


64.5. Let a, b, and c denote the lengths of the sides of a triangle, and d the length of the bisector 
of the angle opposite to the side of length c, terminated on the side. 
(A) Express d in terms of a, b, and c. 


(B) Check the expression obtained in as many ways as you can (by particular cases, limiting 
cases, and so on). 


65.1. “How many children have you, and how old are they?” asked the guest, a mathematics
teacher.


“I have three boys,” said Mr. Smith. “The product of their ages is 72 and the sum of their ages
is the street number.”




The guest went to look at the entrance, came back and said: “The problem is indeterminate.”

“Yes, that is so,” said Mr. Smith, “but I still hope that the oldest boy will some day win the
Stanford competition.”

Tell the ages of the boys, stating your reasons. 


65.2. Of a right triangle, given the length of the hypotenuse c and the area A. On each side of the 
triangle, describe a square exterior to the triangle and consider the least convex figure containing the 
three squares (formed by a tight rubber band around them): it is a hexagon (which is irregular, has 
one side in common with each square, and one of its remaining three sides is obviously of length c). 

Find the area of the hexagon. 


65.3. Let the numbers x, y and 1 measure the lengths of the three sides of some triangle and 
suppose that 


Let the point (x, y) with rectangular coordinates x and y represent the triangle on a plane. Describe 
precisely and sketch clearly the set of those points of the plane that, in the manner explained, represent 

(A) triangles, 

(B) isosceles triangles, 

(C) right triangles, 

(D) acute triangles, 

(E) obtuse triangles. 

Locate the representative points of still other noteworthy triangular shapes. 


65.4. Find the remainder of the division of the polynomial $x + x^9 + x^{25} + x^{49} + x^{81}$ by the
polynomial $x^3 - x$.


References 


1. R. Creighton Buck, A look at mathematical competitions, this MONTHLY, 66 (1959) 201-212.

2. Department of Mathematics, Stanford University, The Stanford University Mathematics
Examination, this MONTHLY, 53 (1946) 406-409.

3. Hungarian Problem Book I, II, Random House, New York, 1963.

4. G. Polya, How to Solve It, 2nd edition, Doubleday Anchor A 93, 1957, Princeton Univ.
Press, 1971.

5. ——, Mathematics and Plausible Reasoning, Vol. 1, Princeton Univ. Press, 1954.

6. ——, Mathematics and Plausible Reasoning, Vol. 2, 2nd edition, Princeton Univ. Press,
1968.

7. ——, Mathematical Discovery, Vol. 1, Wiley, New York, 1962.

8. ——, Mathematical Discovery, Vol. 2, corrected printing, Wiley, New York, 1968.


HOW TO CLASSIFY DIFFERENTIAL POLYNOMIALS 
REUBEN HERSH, University of New Mexico 


Introduction. What can it mean to say that a polynomial is “elliptic,” “hyperbolic,”
“parabolic,” if its degree is not equal to two? The intent of this article is to
explain the meaning of these words. In doing so, we shall see how the Fourier transformation
provides a bridge between elementary algebra and advanced analysis.
Natural questions about differential operators lead to interesting problems in
elementary algebra. Nothing we discuss in this article is new. Our purpose is to
expound some interesting elementary notions which are usually to be found embedded
in highly technical treatises.


Consider a polynomial in n + 1 variables,

(1) $Q(a, b) = \sum_{i=0}^{l} \sum_{|j|=0}^{m} c_{ij}\, a^i b^j,$

where j is a multi-index $j = (j_1, \ldots, j_n)$, b is an n-tuple, and $b^j = \prod_k b_k^{j_k}$.
Associated with Q is an (n + 1)-dimensional differential operator L:

$L = Q(d/dt, d/dy) = \sum c_{ij}\, (d/dt)^i (d/dy)^j.$


Since L is determined by Q, the analytic properties of L evidently must follow from 
the algebraic properties of Q. The classification problem is the problem of finding 
correspondences between the analytic structure of L and the algebraic structure of Q. 

In the classical cases, Q is quadratic, and a linear transformation of independent
variables reduces L to a sum of squares, or to a single linear term plus a sum of
squares. We obtain a classification for differential operators which corresponds to
the classification of quadratic surfaces. If the number of independent variables is 2,
we find exactly three “types” of second-order differential operators, which correspond
to the three types of non-degenerate conic sections.

By a miraculous providence, to each of these three types of second-order equations
there corresponds a famous problem of classical physics. Let $Q_H(a, b)$ denote the
parabolic polynomial $a - b^2$. Then $L_H u = u_t - u_{yy} = 0$ is the parabolic “heat
equation,” which is satisfied by the temperature in a homogeneous heat conductor.
The hyperbolic form $Q_W = a^2 - b^2$ is associated with the “wave equation,”
$L_W u = u_{tt} - u_{yy} = 0$, which describes small vibrations of an elastic medium. And if
we interpret t as a spatial coordinate and let $Q_P = a^2 + b^2$, then the equilibrium
state of a homogeneous medium (e.g., thermal equilibrium of a heat conductor)
satisfies the “potential equation,” $L_P u = u_{tt} + u_{yy} = 0$.


Reuben Hersh received his Ph.D. at NYU under Peter Lax. He has held positions at Fairleigh 
Dickinson, Stanford University, and the University of New Mexico. He spent a year leave at the 
Courant Institute. His main research is in Partial Differential Equations and in Random Processes. 
He has co-authored several recent articles in the Scientific American. Editor. 




642 REUBEN HERSH [June-July 


These physical interpretations help give us an orientation on the operator L.
For instance, how many initial conditions should be given at t = 0 if Lu = 0 for
positive t?

Experience with ordinary differential equations suggests a number of independent
initial conditions equal to l (the order of L with respect to d/dt). This answer is correct
for the wave and heat equations, but not for the potential equation. This discrepancy
is easy to understand in terms of the associated physical models. But from a mathematical
viewpoint, the difference has to be explained by an algebraic difference between
$a^2 + b^2$ on the one hand, and $a^2 - b^2$ or $a - b^2$ on the other.


DEFINITION 1. We will say Cauchy’s problem is correct for L if, for arbitrary $f_k(y)$
which are in $C_0^\infty$ (infinitely differentiable with compact support) there is a unique
solution of


(2a) $Lu = 0$ in $t > 0$,
(2b) $(d/dt)^k u(0, y) = f_k$, $0 \le k \le l - 1$.


We do not demand that the solution u be in $C_0^\infty$ but only that it have as many
derivatives as are required to be in the domain of L, and that these derivatives be in
$L_2$ with respect to y, for each fixed t.

Correctness in this sense is an intrinsic property: a property of the solutions u
of Lu = 0. We are asking for the corresponding formal property: an algebraic
property of Q which should, in particular, be possessed by $Q_H$ and $Q_W$ but not by $Q_P$.

The three classical equations differ, not only in the number of conditions to give
a well-posed problem, but also in the qualitative behavior of their solutions. The
parabolic heat equation has a smoothing property: the solution $u_H$ is infinitely
differentiable, for positive t, even if the initial values f are discontinuous.

We can see this directly from the solution formula 


$u_H(t, y) = (4\pi t)^{-1/2} \int f(x) \exp\left(-(y - x)^2 / 4t\right) dx.$
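The smoothing property can also be observed numerically. The following Python sketch is not part of the original article: it approximates the solution formula by the midpoint rule for discontinuous step data and compares the result with the closed form $\tfrac12(1 + \mathrm{erf}(y / 2\sqrt{t}))$, which is smooth in y for every t > 0. The function names, the truncation `half_width`, and the step count are illustrative assumptions.

```python
import math

def heat_solution(f, t, y, half_width=20.0, steps=4000):
    """Midpoint-rule approximation of
    (4 pi t)^(-1/2) * integral f(x) exp(-(y - x)^2 / (4 t)) dx
    over the truncated interval [y - half_width, y + half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = y - half_width + (k + 0.5) * h
        total += f(x) * math.exp(-(y - x) ** 2 / (4.0 * t))
    return total * h / math.sqrt(4.0 * math.pi * t)

def step(x):
    """Discontinuous initial data: 0 for x < 0, 1 for x >= 0."""
    return 1.0 if x >= 0.0 else 0.0

def exact_step_solution(t, y):
    """Closed form of the heat solution for the step data above."""
    return 0.5 * (1.0 + math.erf(y / (2.0 * math.sqrt(t))))

if __name__ == "__main__":
    for y in (-0.3, 0.0, 0.3):
        print(y, heat_solution(step, 0.5, y), exact_step_solution(0.5, y))
```

The jump of the data at x = 0 is replaced, for any positive t, by a smooth error-function profile.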


DEFINITION 2. We say L is intrinsically parabolic if it is correct for Cauchy’s
problem even for initial data which are arbitrary members of $L_2(R^n)$, and if, moreover,
the solution is infinitely differentiable.

What algebraic property of $Q_H = a - b^2$ corresponds to this intrinsic property of
$L_H$?

Solutions of the wave equation have an intrinsic property of great physical
interest, the property of finite signal speed. By this we mean that if the initial data,
$f_0$ and $f_1$, have support in $|y| < M$ then, for some “signal speed” c, the solution at
time t has support in $|y| < M + ct$.

We can see this directly from D’Alembert’s formula, 




$u_W(t, y) = \tfrac{1}{2}\left[ f_0(y + t) + f_0(y - t) + \int_{y-t}^{y+t} f_1(s)\, ds \right].$


In this case, c = 1. 
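As an illustration that is not in the original text, the following Python sketch evaluates D’Alembert’s formula, with the integral of $f_1$ approximated by the midpoint rule, and checks that data supported in |y| < M produce a solution that vanishes for |y| > M + t. The tent-shaped `bump` is a hypothetical choice of data; it is only continuous, not infinitely differentiable, which is enough to display the support property.

```python
def dalembert(f0, f1, t, y, steps=2000):
    """u(t, y) = (1/2) [ f0(y + t) + f0(y - t) + integral_{y-t}^{y+t} f1(s) ds ],
    with the integral approximated by the midpoint rule."""
    h = 2.0 * t / steps
    integral = sum(f1(y - t + (k + 0.5) * h) for k in range(steps)) * h
    return 0.5 * (f0(y + t) + f0(y - t) + integral)

def bump(y, M=1.0):
    """Hypothetical data supported in |y| < M (a tent function)."""
    return max(0.0, M - abs(y))

if __name__ == "__main__":
    t = 2.0
    for y in (0.0, 2.5, 3.5):   # with M = 1, the point y = 3.5 lies outside |y| < M + t
        print(y, dalembert(bump, bump, t, y))
```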


DEFINITION 3. We say L is intrinsically hyperbolic if it is correct for Cauchy’s 
problem and if, for some c, it has finite signal speed. 

We have to find the formal property, possessed in particular by $Q_W = a^2 - b^2$,
which corresponds to this intrinsic property.

Finally, for the potential equation, there is the property that the solution is
determined, both for t > 0 and for t < 0, if its value is given for t = 0 (Dirichlet’s
problem). We shall use this to give a definition of intrinsic ellipticity, and seek a
corresponding formal property.
We shall see that the Fourier transform answers all these questions with surprising
ease and precision. Recall that if f(y) is in $L_2(R^n)$ and

(3a) $Ff = \hat f(\eta) = (2\pi)^{-n/2} \int \exp(-i\eta \cdot y)\, f(y)\, dy_1 \cdots dy_n,$

then $\hat f(\eta)$ is in $L_2(R^n)$ and

(3b) $F^{-1}\hat f = f(y) = (2\pi)^{-n/2} \int \exp(i\eta \cdot y)\, \hat f(\eta)\, d\eta_1 \cdots d\eta_n.$

For u(t, y), we denote by $Fu = \hat u(t, \eta)$ the transform in y alone, t remaining
constant. If f and all its derivatives are in $L_2$, integration by parts gives

(4) $F((d/dy)^j f) = (i\eta)^j \hat f(\eta).$

Since F is a linear operator, we get, combining (4) and (1),

(5) $F[Lu] = F[Q(d/dt, d/dy)u] = F \sum c_{ij} (d/dt)^i (d/dy)^j u = \sum c_{ij} (d/dt)^i (i\eta)^j \hat u = Q(d/dt, i\eta)\hat u.$

From (4) we see also that if df/dy exists and is in $L_2$, then $\eta \hat f$ is in $L_2$. In other
words, the smoother f, the more rapid the decay of $\hat f$ at infinity. In the same way,
using (3b), we see that the more rapidly $\hat f$ decays, the smoother is f.
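The effect of the Fourier transform on L is thus a substitution: d/dy is replaced by multiplication by $i\eta$. As a small sketch that is not part of the original article, one can store Q as a dictionary of coefficients $c_{ij}$ and evaluate the algebraic expression $Q(\tau, i\eta)$ obtained by also writing $\tau$ for d/dt, here for a single space variable. The dictionary encoding and the names `symbol`, `Q_H`, `Q_W`, `Q_P` are assumptions made for this illustration.

```python
def symbol(Q, tau, eta):
    """Evaluate Q(tau, i*eta), where Q = {(i, j): c_ij} stands for sum c_ij a^i b^j."""
    return sum(c * tau ** i * (1j * eta) ** j for (i, j), c in Q.items())

# The three classical polynomials in one space variable.
Q_H = {(1, 0): 1, (0, 2): -1}   # a - b^2     (heat)
Q_W = {(2, 0): 1, (0, 2): -1}   # a^2 - b^2   (wave)
Q_P = {(2, 0): 1, (0, 2): 1}    # a^2 + b^2   (potential)

if __name__ == "__main__":
    eta = 2.0
    print(symbol(Q_H, -eta ** 2, eta))   # tau = -eta^2 annihilates the heat symbol
    print(symbol(Q_W, 1j * eta, eta))    # tau = i*eta annihilates the wave symbol
    print(symbol(Q_P, eta, eta))         # tau = eta annihilates the potential symbol
```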


2. Equations correct for Cauchy’s problem. Now we are ready to use the Fourier
transform to classify Q. The first problem is to identify Q for which Cauchy’s problem
(2a, b) is correct. In view of (5), $\hat u$ satisfies


(6a) $Q(d/dt, i\eta)\hat u = 0,$
(6b) $(d/dt)^k \hat u(0) = \hat f_k$, $0 \le k \le l - 1$.


Equation (6a) is an ordinary differential equation in t. The coefficients are functions
of the parameter $\eta$, but do not depend on t, so for each $\eta$ it is solved in the same
explicit elementary way as an ordinary differential equation with constant coefficients.
One simply writes down the general solution


(7) $\hat u = \sum_s \sum_p \gamma_{sp}(\eta)\, t^p \exp(\tau_s(\eta) t),$


where $\tau_s(\eta)$ is a root of $Q(\tau, i\eta) = 0$ and, for a root of multiplicity $m_s$, the power p
runs from 0 to $m_s - 1$. The coefficients $\gamma_{sp}$ are
found by solving the linear algebraic system of equations which results from substituting
(7) into (6b).

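In the case of the wave equation $Q_W = a^2 - b^2$, we have l = 2 and we find


(8) $\hat u_W = \tfrac{1}{2} \hat f_0 \left(e^{i\eta t} + e^{-i\eta t}\right) + \frac{1}{2i\eta} \hat f_1 \left(e^{i\eta t} - e^{-i\eta t}\right).$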

For the heat equation, with l = 1, we get
(9) $\hat u_H = e^{-\eta^2 t} \hat f_0.$


In both cases, it is clear that the solution-transforms $\hat u$ decay, as $|\eta| \to \infty$, at least as
rapidly as the data-transforms $\hat f$, since the exponential multipliers $e^{\pm i\eta t}$ and $e^{-\eta^2 t}$ are
bounded by 1. Now, assuming that the data f are infinitely differentiable, with all
derivatives in $L_2$, it follows from (4) that the data-transforms $\hat f$ decay more rapidly
than $|\eta|^{-r}$ for any r. Then $\hat u_W$ and $\hat u_H$ also decay more rapidly than $|\eta|^{-r}$. From this
it follows that the inverse transform (3b) converges uniformly and absolutely, and
the solutions $u_W$ and $u_H$ exist and are infinitely differentiable, with all derivatives
in $L_2$.

On the other hand, what happens if we try to solve (2) for the potential equation,
$Q_P = a^2 + b^2$?

This time we get for $\hat u$ the formula

(10) $\hat u_P = \tfrac{1}{2} e^{\eta t} \left(\hat f_0 + \hat f_1/\eta\right) + \tfrac{1}{2} e^{-\eta t} \left(\hat f_0 - \hat f_1/\eta\right).$


If the solution u exists, it is to be got from (10) by means of the inversion integral
(3b). But the factor $e^{\eta t}$ in the first term of (10) will certainly make the inverse transform
integral diverge for t > 0, unless $\hat f_0 + \hat f_1/\eta$ decays as rapidly as $e^{-\eta t}$. This is a severe
constraint on the choice of $f_1$, given $f_0$. On the other hand, if we prescribe only
$u(0, y) = f_0$ (Dirichlet’s problem) then (6b) consists of only a single equation. It
is satisfied by


(11) $\hat u_P = e^{-|\eta| t} \hat f_0,$


which, for $t \ge 0$, clearly has an inverse transform u. (The problem for t < 0 would be
solved by $e^{|\eta| t} \hat f_0$.)




Thus we can see directly from examination of the transforms, without actually
finding any of the solutions $u_H$, $u_W$ or $u_P$, that Cauchy’s problem is indeed correct
for $a - b^2$ and $a^2 - b^2$, but not for $a^2 + b^2$. These examples suggest that the number of
conditions we can impose at t = 0 is equal to the number of exponential terms in (7)
which are bounded as $|\eta| \to \infty$. Altogether there are l terms, as many as the order in
d/dt of L. In Cauchy’s problem we have to satisfy l conditions, so we need to have
all of them well-behaved. We are led to


DEFINITION 4. Q is correct in the sense of Petrowsky if, for all real n-tuples $\eta$,
all roots $\tau_s(\eta)$ of $Q(\tau, i\eta) = 0$ satisfy


(12) $\mathrm{Re}\, \tau_s \le M$ for some constant M.
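Definition 4 is easy to test by machine on the three classical examples. The following Python sketch, which is not part of the original article, computes the roots $\tau$ of $Q(\tau, i\eta) = 0$ for polynomials of degree 1 or 2 in a and one space variable, and reports the largest real part over a finite sample of real $\eta$. A finite sample can only suggest, not prove, that (12) holds; the encoding of Q as a dictionary and all names are assumptions.

```python
import cmath

def roots_in_tau(Q, eta):
    """Roots tau of Q(tau, i*eta) = 0 for Q = {(i, j): c_ij}, of degree 1 or 2 in a."""
    coeff = [0j, 0j, 0j]                      # coeff[i] multiplies tau**i
    for (i, j), c in Q.items():
        coeff[i] += c * (1j * eta) ** j
    c0, c1, c2 = coeff
    if c2 == 0:                               # first order in tau (c1 assumed nonzero)
        return [-c0 / c1]
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return [(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)]

def max_real_part(Q, etas):
    """Largest Re tau over the sampled real values of eta."""
    return max(r.real for eta in etas for r in roots_in_tau(Q, eta))

Q_H = {(1, 0): 1, (0, 2): -1}   # a - b^2     (heat)
Q_W = {(2, 0): 1, (0, 2): -1}   # a^2 - b^2   (wave)
Q_P = {(2, 0): 1, (0, 2): 1}    # a^2 + b^2   (potential)
etas = [float(k) for k in range(-50, 51)]

if __name__ == "__main__":
    for name, Q in (("heat", Q_H), ("wave", Q_W), ("potential", Q_P)):
        print(name, max_real_part(Q, etas))
```

For the heat and wave polynomials the sampled real parts never exceed zero, while for the potential polynomial the largest real part grows with the sampling range, in agreement with the discussion above.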


We claim that if Q is correct in the sense of Petrowsky, then Cauchy’s problem
is correct for L. We need only show that $\hat u$ as given by (7) exists and decays more
rapidly than $|\eta|^{-r}$ for any r. (The data f are assumed to be infinitely differentiable
with compact support.)

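As to the existence of $\hat u$, all we need to do is find the $\gamma_{sp}(\eta)$ by solving the linear
algebraic equations,


$\left(\frac{d}{dt}\right)^k \left( \sum \gamma_{sp}\, t^p e^{\tau_s t} \right)\Big|_{t=0} = \hat f_k$ for $k = 0, \ldots, l - 1$.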


If the roots $\tau$ are distinct, so that no powers of t actually appear in front of the
exponential, then the coefficient matrix of this system is the familiar Vandermonde
matrix, which is known to be nonsingular so long as $\tau_j \ne \tau_k$ for $j \ne k$. In case of
multiple roots, we get a closely related “generalized Vandermonde matrix.” The
reader who has never seen it done will find it an enjoyable exercise in old-fashioned
advanced algebra to show that in this case also the matrix is non-singular. This takes
care of the first point, existence of $\hat u$ for all finite $\eta$.

To show that $\hat u$ decays rapidly as $|\eta| \to \infty$, we have to consider the dependence
on $\eta$ of these coefficients $\gamma_{sp}(\eta)$ whose existence we have just established. Since they
are solutions of a system of linear equations whose coefficients are powers of $\tau$ and
whose right sides are the data-transforms $\hat f_k(\eta)$, Cramer’s rule for solving linear
equations shows that $\gamma_{sp}(\eta)$ is less than some polynomial in $|\eta|$ times $\sup_k |\hat f_k(\eta)|$.

If $f_j$ is in $C_0^\infty$, $\hat f_j(\eta)$ is $O(|\eta|^{-r})$ for any r. The same is then true of $\gamma_{sp}(\eta)$, and
therefore, by virtue of (7) and (12), of $\hat u$ as well, as we claimed.

It is natural to ask whether the converse is also true, whether it is necessary for Q 
to be Petrowsky-correct in order for L to be correct for Cauchy’s problem. The 
answer is yes. This is not obvious on the face of it. From (7) it would seem that all 
one needs for existence of u is 


(13) $\mathrm{Re}\, \tau_s(\eta) \le M \log |\eta|$.




But it can be shown that (13) and (12) are equivalent. This follows from an important 
algebraic lemma: 


LEMMA. The function $\Lambda(z)$ defined by

$\Lambda(z) = \sup \mathrm{Re}\, \tau_s(\eta)$, the supremum being taken over $|\eta| = z$, $Q(\tau, i\eta) = 0$,

is a piecewise algebraic function of z.


This lemma is a special case of a famous theorem of Tarski on the decidability 
of the elementary theory of real polynomials. 

Tarski’s theorem says that one can eliminate (solve for) some of the variables from 
a given system of real polynomial equations and inequalities, provided the remaining 
variables satisfy one of a finite number of finite systems of polynomial equations and 
inequalities. A short and elegant proof of Tarski’s theorem has been given by Paul 
Cohen [10]. A full discussion, with an exposition of Seidenberg’s proof of Tarski’s 
theorem, is in [8]. 

Here we simply point out the implications of the lemma for our classification
problem. Near $z = \infty$, $\Lambda(z) = \Lambda(|\eta|)$ is equal to one of a finite collection of algebraic
functions of $|\eta|$. Each of these algebraic functions of one variable has a
Puiseux expansion in fractional powers of $|\eta|$. (No such expansion exists for algebraic
functions of two or more variables. It is precisely to make use of this one-variable
result that the lemma is necessary.) We conclude that if any of the roots violates (12),
then it must go to $+\infty$ like some positive fractional power of $|\eta|$, and so it must
also violate (13). Thus we see that Petrowsky-correctness of Q is indeed equivalent
to intrinsic correctness of L.


3. Parabolic equations. Next we define “parabolic.” We mentioned earlier that
$u_H$, the solution of Cauchy’s problem for the heat equation, is infinitely differentiable
even for singular initial data. We now can read off this conclusion from (9). Indeed,
if $f_0$ is in $L_2$, then so is $\hat f_0$. The factor $e^{-\eta^2 t}$ makes $\hat u(t, \eta)$ decay more rapidly than any
negative power of $|\eta|$, which means that u(t, y) is infinitely differentiable.

This pleasant state of affairs comes about because the root $\tau_1$ of the equation
$\tau - (i\eta)^2 = 0$ goes to $-\infty$ as $|\eta| \to \infty$. We are thereby led to


DEFINITION 5. Q is formally parabolic if all roots $\tau_j$ of $Q(\tau, i\eta) = 0$ satisfy
(14) $\mathrm{Re}\, \tau_j \to -\infty$ as $|\eta| \to \infty$ through real values of $\eta$.


We want to prove that if Q is formally parabolic, then L is intrinsically parabolic.
Since formal parabolicity implies Petrowsky-correctness, we need only prove that
u(t, y) is infinitely differentiable even for arbitrary $L_2$ initial data. We again refer to
(7). We have seen that the $\gamma_{sp}(\eta)$ grow at most like some power of $|\eta|$. We need to show
that $\hat u$ decays faster than $|\eta|^{-r}$ for arbitrary r. Now, by the same algebraic argument
we used above, one can show that if $\mathrm{Re}\, \tau_j$ goes to $-\infty$ as $|\eta| \to \infty$, then for some
positive constants $\alpha$ and $\beta$, $\mathrm{Re}\, \tau_j \le -\alpha |\eta|^{\beta}$. This estimate does guarantee that $\hat u$
decays faster than $|\eta|^{-r}$ for any r, and so u is infinitely differentiable, as we wanted
to prove.

To prove the converse, one shows that if some root $\tau$ of $Q(\tau, i\eta) = 0$ fails to
satisfy (14), then, by an appropriate choice of data f, one can construct a solution
u(t, y) which is not $C^\infty$, that is, whose transform $\hat u$ does not decay like $|\eta|^{-r}$ for some r.

It should be emphasized that in this definition, unlike a usage sometimes found
in the older literature, an equation by no means need be parabolic merely because it is
of lower order in some variables than in others. For example, the equation $u_t = i u_{yy}$
is not parabolic, as the reader may verify.
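The verification takes one line: for $u_t = i u_{yy}$ the polynomial is $a - i b^2$, so the single root is $\tau = i (i\eta)^2 = -i\eta^2$, whose real part is zero and therefore does not tend to $-\infty$. The short Python sketch below, which is an illustration and not part of the original article, contrasts this root with the heat-equation root $\tau = (i\eta)^2 = -\eta^2$; the function names are assumptions.

```python
def heat_root(eta):
    """Root of tau - (i*eta)**2 = 0, from the polynomial a - b^2."""
    return (1j * eta) ** 2

def schrodinger_root(eta):
    """Root of tau - i*(i*eta)**2 = 0, from the polynomial a - i b^2 (u_t = i u_yy)."""
    return 1j * (1j * eta) ** 2

if __name__ == "__main__":
    for eta in (1.0, 10.0, 100.0):
        print(eta, heat_root(eta).real, schrodinger_root(eta).real)
```

The first real part decreases like $-\eta^2$, while the second stays at zero for every real $\eta$: the equation is correct in the sense of Petrowsky without being parabolic.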

From the intrinsic form of the definition, it is immediate, without any computation,
that any smooth invertible transformation of the y-variables takes a parabolic
equation into a parabolic equation. The same is true for any smooth transformation
of the closed half-line $0 \le t < \infty$ onto itself.

If Q is parabolic, then it is actually the case that any solution of Lu = 0 is infinitely
differentiable, even if u is defined only on some open subset of t-y space.
This property is called “hypoellipticity.” A famous result of Hörmander is that L is
hypoelliptic if and only if, for all complex $\eta$, $Q(\eta) = 0$ and $|\eta| \to \infty$ imply $|\mathrm{Re}\, \eta| \to \infty$.
(Here, unlike Definition 5, there is no “distinguished” variable t. All the variables
are treated as equals.)


4. Hyperbolic equations. Next, we look for an algebraic criterion for hyperbolicity.
We must find a property of $\hat u$ which is related to finite signal speed of u in the
same way that rapid decay of $\hat u$ is related to smoothness of u. This need is precisely
filled by the theorem of Paley and Wiener.

Suppose the data f in (7) vanish for $|y| > M$. Then the domain of integration in
(3) is a bounded set, and it is clear that the integral converges even if $\eta$ is complex.
Moreover, the resulting function $\hat f(\eta)$ is differentiable, for all real or complex $\eta$,
and it satisfies the obvious estimate,


(15) $|\hat f(\eta)| \le \mathrm{const}\; e^{M |\eta|}.$


The constant is just the maximum value of $|f(y)|$. In brief, $\hat f$ is entire analytic, of
exponential type M. For real $\eta$, $\hat f$ is of course in $L_2$.

The Paley-Wiener theorem says that, conversely, if $\hat f$ is entire, satisfies (15), and
is square-integrable for real $\eta$, then f(y) = 0 for $|y| > M$.
A more precise version has been given by Plancherel and Polya. Given a closed
bounded set K in real y-space, the “support function” of K is defined by
$s(\eta) = \max_{x \in K} x \cdot \eta$. If $\hat f$ is square-integrable for real $\eta$, and if $\hat f$ is entire analytic and
satisfies, for all complex $\eta$,


$|\hat f(\eta)| \le \mathrm{const}\, \exp[s(\mathrm{Im}\, \eta)]$


then f vanishes outside K.




The idea of the proof is simple. One considers an arbitrary y not in K, and
chooses $\omega$ so that $y \cdot \omega > s(\omega)$. In the inversion formula (3b), since the integrand is now
entire analytic, we may shift the path of integration into complex $\eta$-space, taking
$\eta + ik\omega$, $-\infty < \eta < \infty$, as the new path of integration. By Cauchy’s theorem, the
value of the integral is unchanged. If k is very large, $|f|$ will be as small as we please.

Let us apply this theorem to the wave equation. First we make the obvious remark
that, if two functions are entire of exponential type $M_1$ and $M_2$, then their product
is also entire, of exponential type $M_1 + M_2$. Now, if the data $f_0$ and $f_1$ have support
in $|y| < M$, then, by the converse of the Paley-Wiener theorem, their transforms
$\hat f_0$ and $\hat f_1$ are entire functions of exponential type M. Since $\cos \eta t$ and $\sin \eta t / \eta$ are, as
functions of $\eta$, entire of exponential type t, we see from formula (8) that $\hat u_W$ is entire
of exponential type M + t, and so by Paley-Wiener, $u_W$, the solution of Cauchy’s
problem for the wave equation, vanishes for $|y| > M + t$. That is, the wave equation
has finite speed of propagation.
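The claim that $\cos \eta t$ and $\sin \eta t / \eta$ are of exponential type t can be spot-checked at a few complex frequencies. The Python sketch below is an illustration that is not part of the original article; it tests the bounds $|\cos \eta t| \le e^{t|\eta|}$ and $|\sin \eta t / \eta| \le t\, e^{t|\eta|}$, which follow from $|\cos z| \le e^{|\mathrm{Im}\, z|}$ and $\sin z / z = \int_0^1 \cos(sz)\, ds$. The sample points and the small rounding slack are arbitrary assumptions.

```python
import cmath
import math

def wave_multipliers(eta, t):
    """Return cos(eta*t) and sin(eta*t)/eta for complex eta != 0."""
    return cmath.cos(eta * t), cmath.sin(eta * t) / eta

def has_type_t(eta, t, slack=1e-9):
    """Check |cos(eta t)| <= e^{t|eta|} and |sin(eta t)/eta| <= t e^{t|eta|}."""
    c, s = wave_multipliers(eta, t)
    bound = math.exp(t * abs(eta))
    return abs(c) <= bound * (1 + slack) and abs(s) <= t * bound * (1 + slack)

# Arbitrary complex sample frequencies, chosen only for illustration.
samples = [complex(x, y) for x in (-5.0, 0.5, 3.0) for y in (-4.0, 0.25, 6.0)]

if __name__ == "__main__":
    print(all(has_type_t(eta, 1.5) for eta in samples))
```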

For the general case, we have to look at (7). Now we suppose that $f_k(y) = 0$ for
$|y| > M$, so all the data-transforms $\hat f_k$ are entire of exponential type M. It follows,
without any additional hypotheses, that $\hat u$ is an entire analytic function of $\eta$. To see
this, we recall that by our previous analysis, the functions $\gamma_{sp}(\eta)$ are equal to linear
combinations of $\hat f_k$, with coefficients which are rational functions of $\eta$ and of the
roots $\tau_s(\eta)$. Now, the roots $\tau_s$ are multivalued algebraic functions of $\eta$, so there seems
to be a question about the analyticity of $\hat u$ at branch points of $\tau$. But even though
each root $\tau_s$ is multivalued, $\hat u$ is continuous and single-valued as a function of real or
complex $\eta$; indeed, $\hat u$ is the solution of the ordinary differential equation (6a, b), and
so is single-valued and depends continuously on the parameter $\eta$. Since $\hat u$ is continuous
and single-valued for all complex $\eta$, and in particular, at the branch points of
$\tau(\eta)$, it is entire analytic in each $\eta_k$ separately, and therefore it is an entire analytic
function of the n-tuple $(\eta_1, \ldots, \eta_n)$.

What is needed is an extra hypothesis on Q to make $\hat u$ a function of exponential
type.


DEFINITION 6. We say Q is formally hyperbolic if it is correct in the sense of
Petrowsky and if, in addition, the roots $\tau(\eta)$ of $Q(\tau, i\eta) = 0$ satisfy, for all complex $\eta$,
$|\mathrm{Re}\, \tau(\eta)| \le C_0 |\eta| + C_1$, for some constants $C_0$ and $C_1$.
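The point of Definition 6 is that the bound must hold for complex $\eta$, not only for real $\eta$. As a sketch that is not part of the original article, the Python fragment below evaluates the ratio $|\mathrm{Re}\, \tau| / |\eta|$ along the imaginary axis for the wave polynomial $a^2 - b^2$ (roots $\pm i\eta$) and for the heat polynomial $a - b^2$ (root $-\eta^2$). The ratio stays at 1 for the wave roots and grows without bound for the heat root, so the heat polynomial, although correct in the sense of Petrowsky, is not formally hyperbolic. Function names and sample points are assumptions.

```python
def wave_roots(eta):
    """Roots of tau^2 - (i*eta)^2 = 0."""
    return [1j * eta, -1j * eta]

def heat_roots(eta):
    """Root of tau - (i*eta)^2 = 0."""
    return [(1j * eta) ** 2]

def growth_ratio(roots, eta):
    """max |Re tau| / |eta| over the roots at a (possibly complex) eta != 0."""
    return max(abs(r.real) for r in roots(eta)) / abs(eta)

# Purely imaginary sample values of eta.
samples = [complex(0.0, s) for s in (1.0, 10.0, 100.0)]

if __name__ == "__main__":
    for eta in samples:
        print(eta, growth_ratio(wave_roots, eta), growth_ratio(heat_roots, eta))
```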


If Q satisfies Definition 6, and if the data-transforms are entire of exponential
type M, then $\hat u$ is entire of exponential type $tC_0 + M$. The same reasoning as in the
special case of the wave equation now shows that if Q is formally hyperbolic, L is
intrinsically hyperbolic, with signal speed not greater than $C_0$. The converse is also
true, and is proved with the help of our algebraic lemma.

An important example of a hyperbolic equation is the first-order equation, 


(16) $Qu = u_t - \sum_j a_j \frac{\partial u}{\partial y_j}.$




If we let u be a vector and the $a_j$ be symmetric matrices, (16) becomes a symmetric
hyperbolic system of equations. It is hyperbolic because the roots $\tau(\eta)$ of $\det Q(\tau, i\eta) = 0$
are simply the eigenvalues of the skew-Hermitian matrix $i \sum_j \eta_j a_j$, so they have
zero real part and grow like a first power of $|\eta_j|$ for each j. Maxwell’s equations in
vacuo are a famous example of a symmetric hyperbolic system.
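For 2 x 2 real symmetric matrices the statement can be checked with the quadratic formula alone. The following Python sketch is an illustration that is not part of the original article: for real $\eta_1, \eta_2$ it forms $\eta_1 a_1 + \eta_2 a_2$, computes its real eigenvalues $\lambda$, and returns the roots $\tau = i\lambda$. The two matrices are hypothetical examples, and each symmetric matrix [[p, q], [q, r]] is stored as the triple (p, q, r).

```python
import math

def symmetric_2x2_eigenvalues(p, q, r):
    """Real eigenvalues of the symmetric matrix [[p, q], [q, r]]."""
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return [mean - radius, mean + radius]

def roots_tau(a1, a2, eta1, eta2):
    """Roots of det(tau*I - i*(eta1*a1 + eta2*a2)) = 0 for real eta1, eta2."""
    p, q, r = (eta1 * x + eta2 * y for x, y in zip(a1, a2))
    return [1j * lam for lam in symmetric_2x2_eigenvalues(p, q, r)]

a1 = (1.0, 0.0, -1.0)   # [[1, 0], [0, -1]]
a2 = (0.0, 1.0, 0.0)    # [[0, 1], [1, 0]]

if __name__ == "__main__":
    print(roots_tau(a1, a2, 3.0, 4.0))
```

For this particular pair the eigenvalues are $\pm(\eta_1^2 + \eta_2^2)^{1/2}$, so the roots are purely imaginary and of first-power growth.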

If the degree of Q(a, b) as a polynomial in a and b is the same as l, its degree in a
alone, one says that Q is “non-characteristic” for the initial plane t = 0. By comparing
Definitions 4 and 6, it is easy to see that Q is formally hyperbolic if and only if it
is non-characteristic and Petrowsky correct. In view of the equivalence we have
shown between the formal and intrinsic properties, this means that for non-characteristic
polynomials, correctness for Cauchy’s problem (in the sense of Definition 1)
implies finite signal speed. (By contrast, the heat equation and other parabolic
equations are characteristic for t.)
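The non-characteristic condition is a pure degree count and is simple to mechanize. In the Python sketch below, which is not part of the original article, a polynomial in a and one variable b is stored as a dictionary {(i, j): c_ij}, and the total degree is compared with the degree in a alone. The encoding and the names are assumptions.

```python
def is_noncharacteristic(Q):
    """True if the total degree of Q(a, b) equals its degree in a alone."""
    terms = [(i, j) for (i, j), c in Q.items() if c != 0]
    total_degree = max(i + j for i, j in terms)
    degree_in_a = max(i for i, _ in terms)
    return total_degree == degree_in_a

Q_H = {(1, 0): 1, (0, 2): -1}   # a - b^2     (heat)
Q_W = {(2, 0): 1, (0, 2): -1}   # a^2 - b^2   (wave)
Q_P = {(2, 0): 1, (0, 2): 1}    # a^2 + b^2   (potential)

if __name__ == "__main__":
    for name, Q in (("heat", Q_H), ("wave", Q_W), ("potential", Q_P)):
        print(name, is_noncharacteristic(Q))
```

The potential polynomial passes this test as well; it fails to be hyperbolic because it is not correct in the sense of Petrowsky, not because of its degrees.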

On the other hand, the finite signal speed property actually implies that Cauchy’s
problem is correct, even for initial data which grow as $|y| \to \infty$ with arbitrary
rapidity. To prove this, one writes the initial data as a sum of terms with support in
the cells, $j \le y_i < j + 1$. To each term in this expansion there corresponds a term in
the solution; each term has compact support and can be obtained by the Fourier
transform method.

By the finite signal speed property, all but a finite number of these solution
terms must vanish for any given value of t, in any cell $j \le y_i < j + 1$. Therefore, the
sum exists and is the solution to the given Cauchy problem with arbitrary unbounded
data.

An interesting difference between hyperbolicity and parabolicity appears if we
consider problems in a region with boundary. Suppose that Lu = 0 is satisfied only
for t > 0, $y_1 > 0$, and on $y_1 = 0$, t > 0, u satisfies some set of boundary conditions,
$B_k(d/dt, d/dy)u = 0$. If L is parabolic, then so long as a solution exists, it will be
smooth in the interior (t > 0, $y_1 > 0$); the choice of boundary operators cannot
overcome the smoothing property of L. But if L is hyperbolic, the finite signal speed
of L may be lost if surface waves are propagated with infinite speed along the boundary
$y_1 = 0$.

The simplest example is to take 


$Lu = u_{tt} - u_{y_1 y_1} - u_{y_2 y_2} = 0$ in t > 0, $y_1 > 0$


with the single boundary condition


(17) $Bu = u_t - u_{y_2 y_2} = 0$ on $y_1 = 0$.


We also need, of course, initial values, $u(0, y_1, y_2) = f_0(y_1, y_2)$, $u_t(0, y_1, y_2) = f_1(y_1, y_2)$,
subject to the compatibility condition $f_1(0, y_2) = (\partial^2 / \partial y_2^2) f_0(0, y_2)$.

Since the boundary condition involves only tangential derivatives, one can find
the solution u on the boundary t > 0, $y_1 = 0$, by solving Cauchy’s problem for the
heat equation (17) with initial value $f_0(0, y_2)$. Then u is determined in the interior,
t > 0, $y_1 > 0$, by solving a standard boundary value problem, where u itself is now
known on the boundary $y_1 = 0$. Now even if the initial data vanish except near the
origin, $y_1 = y_2 = 0$, it is clear that because the boundary operator B is not hyperbolic,
the solution u will in general be non-zero for arbitrarily large $y_2$ if $y_1 < t$. Thus the
finite speed of propagation is lost.

To guarantee finite speed of propagation for a mixed initial-boundary value
problem, one needs to impose a hyperbolicity condition on the boundary operators
$B_k$ as well as the interior operator L = Q(D).

To find such a condition, one again uses an integral transform to reduce the
partial differential equation to an ordinary differential equation. Fourier transform
can again be applied in all the $y_j$ except $y_1$, the variable which has a boundary. If we
then use a Laplace transformation in t, we again end up with an ordinary differential
equation, but now in $y_1$ instead of in t.

Again the Paley-Wiener theorem yields the precise algebraic condition corresponding
to finite signal speed (see [7]). A simple sufficient condition, in addition to unique
solvability of the mixed initial boundary value problem, is that L is hyperbolic and $B_j$
is homogeneous as a polynomial in $\partial/\partial t$ and $\partial/\partial y$.

It should be mentioned that despite the false impression created by the traditional
classification, Cauchy's problem is correct for many operators which are neither
parabolic nor hyperbolic. A few which have physical interest appear in the Schrödinger
equation $u_t = iu_{yy}$, the vibrating beam equation $u_{tt} + u_{yyyy} = 0$, and the equation
of viscous acoustics,

$$u_{tt} = 2u_{tyy} + u_{yy}.$$


5. Elliptic equations. We now have to consider those operators for which
Cauchy's problem is not well posed. This means there are some "badly behaved"
roots $\tau(\eta)$ whose real part goes to $+\infty$ as $|\eta| \to \infty$ through imaginary values. It is
clear that reasoning identical to that we have used before would show that we can
prescribe a number of initial conditions for u just equal to the number of "well
behaved" roots $\tau_j$ of $Q(\tau, i\eta) = 0$ that satisfy (12).

This sheds some light on what is wrong with an "ultra-hyperbolic" equation
such as

$$u_{tt} = u_{y_1 y_1} - u_{y_2 y_2}.$$

For this example, the roots $\tau(\eta)$ have the form $\tau_\pm = \pm(\eta_2^2 - \eta_1^2)^{1/2}$. As $\eta_1$ and $\eta_2$
go to infinity, $\operatorname{Re}\tau_\pm$ may remain bounded or may go to $+\infty$, depending on our
path in the $\eta_1$-$\eta_2$ plane.
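
The path dependence can be seen numerically. The following is a minimal sketch added for illustration, not part of the original argument; the helper names `ultra_roots` and `max_real_part` and the two sample paths are assumptions made here.

```python
import cmath

def ultra_roots(eta1, eta2):
    """Roots tau of tau**2 = eta2**2 - eta1**2, i.e. of Q(tau, i*eta) = 0
    for the ultra-hyperbolic equation u_tt = u_{y1 y1} - u_{y2 y2}."""
    r = cmath.sqrt(eta2 ** 2 - eta1 ** 2)
    return r, -r

def max_real_part(eta1, eta2):
    """Largest real part among the two roots."""
    return max(t.real for t in ultra_roots(eta1, eta2))

scales = (1, 10, 100, 1000)
# Path A: eta2 = 0, eta1 -> infinity. Both roots are purely imaginary.
path_a = [max_real_part(float(s), 0.0) for s in scales]
# Path B: eta1 = 0, eta2 -> infinity. One root has real part eta2.
path_b = [max_real_part(0.0, float(s)) for s in scales]

print(path_a)  # real parts stay bounded along path A
print(path_b)  # real parts grow without bound along path B
```

Along path A both roots count as "well behaved", while along path B only one does, so the number of admissible initial conditions depends on the direction of approach to infinity.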

Thus, we are unable to prescribe in a natural way a definite number of conditions
for u at t = 0. The number of conditions k should equal the number of roots $\tau_j(\eta)$
satisfying (12), but this number is not well-defined. Shilov has met this difficulty by
prescribing the initial conditions in terms of the transform $\hat u$; then indeed one can
formulate appropriate initial conditions for every Q. Those Q for which k is
independent of $\eta$ are called "regular" by Shilov.

For arbitrary Q, not necessarily regular, and for each path $\theta$ which goes to $\infty$ in
real $\eta$-space, we can define $k(\theta)$ as the number of roots of

$$Q(\tau, i\eta) = 0 \ \text{ which satisfy } \ \operatorname{Re}\tau(\eta) < C \ \text{ on } \theta.$$

Then there will always be two integers,

$$k_1 = \min_\theta k(\theta) \quad\text{and}\quad k_2 = \max_\theta k(\theta).$$

It takes only a moment's reflection to see that $k_1$ is the number of conditions we can
prescribe arbitrarily in $L_2$ at t = 0. On the other hand, if we have a solution u which
satisfies $k_2$ prescribed conditions at t = 0, then its transform $\hat u$ (and so u itself) is
thereby uniquely determined. Thus the "regular" case, the case $k_1 = k_2$, is
precisely the case where it is possible to obtain both existence and uniqueness by
prescribing initial conditions in a manner independent of $\eta$, i.e., in terms of the
"physical" variables t and y.

By applying these remarks on "regular" boundary problems, we can generalize
the half-space Dirichlet's problem for Laplace's equation. We will consider only real
polynomials Q which are homogeneous of even order 2k.

We now use x = 0 instead of t = 0 to specify our boundary. We suppose that
$Q(\partial/\partial x, \partial/\partial y)$ is a real, homogeneous polynomial of even order 2k; $y = (y_1, \ldots, y_n)$
is an n-vector for some $n \ge 1$.


DEFINITION 7. We shall call Q intrinsically elliptic if we can solve Q(D)u = Lu = 0
both in x > 0 and in x < 0, with k boundary conditions,

$$\left.\frac{\partial^r u}{\partial x^r}\right|_{x=0} = f_r \quad \text{for } r = 0, \ldots, k - 1,$$

for "arbitrary" $f_r$ in $L_2$. (The reader should be warned that this definition is not
standard, though it is consistent with the standard one. We shall discuss this point
below.)

To get a corresponding "formal" definition, we notice the most obvious property
of the Laplace polynomial $Q(a, b) = a^2 + b^2$: it is positive in the real (a, b)-plane
except at the origin.


DEFINITION 8. A homogeneous, real polynomial Q is formally elliptic if $Q(a, b) \ne 0$
for real a, b not both zero.

Because of the homogeneity of Q, the roots $\xi(\eta)$ of $Q(\xi, i\eta) = 0$ satisfy
$\xi(s\eta) = s\xi(\eta)$ for all complex s. Furthermore, for real $\eta$, $\xi(\eta)$ is not pure imaginary, for
then we would have $Q(\xi, i\eta) = i^{2k}Q(\xi/i, \eta) = 0$ with real $\xi/i$ and $\eta$, violating the formal
ellipticity. Because the coefficients are real, the roots $\xi/i$ of $Q(\xi/i, \eta) = 0$, which are
all non-real for real $\eta$, consist of pairs of conjugate complex numbers. Counting
them according to multiplicity, there must be k each in the upper and lower half
plane, so that there are exactly k of the roots $\xi(\eta)$ which satisfy (12), and k which
satisfy the opposite inequality. These remarks are enough to show that there exists a
unique $L_2$ solution to our generalized "Dirichlet problem" for either x > 0 or
x < 0. That is, our version of "intrinsic ellipticity" follows from formal ellipticity.
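
The root count can be checked on a concrete family. The sketch below is an illustration added here, not taken from the original text: it assumes the sample polynomial $Q(a, b) = a^4 + c\,a^2b^2 + b^4$ (order 2k = 4, formally elliptic when |c| < 2), and the function names are hypothetical.

```python
import cmath

def roots_biquadratic(c, eta):
    """Roots xi of Q(xi, i*eta) = 0 for Q(a, b) = a**4 + c*a**2*b**2 + b**4.

    Substituting b = i*eta gives xi**4 - c*eta**2*xi**2 + eta**4 = 0,
    which is a quadratic in w = xi**2.
    """
    disc = cmath.sqrt((c * eta ** 2) ** 2 - 4 * eta ** 4)
    roots = []
    for w in ((c * eta ** 2 + disc) / 2, (c * eta ** 2 - disc) / 2):
        r = cmath.sqrt(w)
        roots.extend([r, -r])
    return roots

def count_left_right(c, eta):
    """Number of roots with negative and with positive real part."""
    roots = roots_biquadratic(c, eta)
    left = sum(1 for r in roots if r.real < 0)
    right = sum(1 for r in roots if r.real > 0)
    return left, right

# For |c| < 2 we have Q(a, b) > 0 away from the origin, so Q is formally elliptic.
for c in (1.0, 0.0, -1.0):
    for eta in (0.5, 1.0, 7.0):
        print(c, eta, count_left_right(c, eta))
```

In each sampled case the four roots split into k = 2 with negative real part and k = 2 with positive real part, and none is pure imaginary, as the argument above predicts.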


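Now, the homogeneity implies

$$\xi(\eta) = |\eta|\,\xi(\eta/|\eta|),$$

so that $|\operatorname{Re}\xi| \to \infty$ as $|\eta| \to \infty$ for each root $\xi(\eta)$.

Suppose $\eta = (\eta_1, \ldots, \eta_n)$ and $y = (y_1, \ldots, y_n)$ are n-tuples and $\eta y$ means $\sum_j \eta_j y_j$. Then
the n-fold iterated integral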


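$$u(x, y) = \frac{1}{(2\pi)^n}\int_{-\infty}^{\infty} \sum_{j,k} \gamma_{j,k}(\eta)\, e^{-i\eta y}\, e^{x\xi_k(\eta)}\, d\eta$$

can be written as

$$\sum_{j,k} \frac{1}{(2\pi)^n}\int_{-\infty}^{\infty} \gamma_{j,k}(\eta)\, \exp\bigl(-i\eta y + |\eta|\,x\,\xi_k(\eta/|\eta|)\bigr)\, d\eta.$$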

Suppose, for definiteness, we are interested in the case x > 0. Then we choose
$\xi_k$ so that $\operatorname{Re}\xi_k < 0$. Since $\operatorname{Re}\xi_k(\eta/|\eta|)$ is then a continuous negative function on a
compact set (the unit sphere in real $\eta$-space), it assumes a negative maximum. We can
therefore choose a number A independent of k, such that $\operatorname{Re}\xi_k(\eta/|\eta|) < A < 0$,
for all real $\eta$ and for all the $\xi_k$ with negative real parts. Then it is easy to see that the
integral for u(x, y) converges uniformly for complex x and y such that
$|\operatorname{Im} y| < |A|\,\operatorname{Re} x$, so that, as in the classical second-order case, u(x, y) is real-analytic in
both x and y.

I. G. Petrowsky proved (1938) that if $Q(\partial/\partial x, \partial/\partial y)$ is formally elliptic, then all
solutions of Lu = 0 are analytic (not just those defined in a half-space x > 0, which
we are considering). This property is the one usually taken as the definition of
intrinsic ellipticity. We have deviated from this tradition for the sake of simplicity
and brevity.

We are now in a position to tie together the notions of parabolic and elliptic.
Consider $Q(\partial/\partial t, \partial/\partial y) = \partial/\partial t - Q_0(\partial/\partial y)$, where $Q_0$ is homogeneous of order 2k,
real and elliptic, and y is an n-tuple, $n \ge 2$. Then the roots $\tau(\eta)$ of $Q(\tau, i\eta) = 0$ are
just

$$\tau(\eta) = Q_0(i\eta) = i^{2k}Q_0(\eta) = (-1)^k|\eta|^{2k}\,Q_0(\eta/|\eta|), \qquad \eta \ne 0.$$


Now $Q_0 \ne 0$ on the unit $\eta$-sphere, and since $\eta$, like y, is an n-tuple, $n \ge 2$, the unit
sphere is a compact connected set. It follows that either

$$Q_0(\eta/|\eta|) \le -c < 0 \quad\text{or}\quad Q_0(\eta/|\eta|) \ge c > 0$$

for all real $\eta$. We can strengthen our definition of formal ellipticity to require

$$\operatorname{sgn} Q_0(\eta) = (-1)^{k+1} \quad\text{for real } \eta.$$

Then it is clear that $\operatorname{Re}\tau(\eta) = Q_0(i\eta)$ goes to $-\infty$ as $|\eta| \to \infty$ through real values,
and $\partial/\partial t - Q_0(\partial/\partial y)$ is parabolic.
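
As a quick check of the sign convention (a worked instance added here for illustration, not taken from the original text), take k = 1 and $Q_0(a) = a_1^2 + \cdots + a_n^2$, so that $\operatorname{sgn} Q_0(\eta) = +1 = (-1)^{k+1}$:

```latex
\begin{aligned}
Q_0(\partial/\partial y) &= \Delta_y, \\
\tau(\eta) = Q_0(i\eta) &= i^{2}\,Q_0(\eta) = -|\eta|^{2} \;\to\; -\infty
   \quad (|\eta| \to \infty), \\
\partial/\partial t - Q_0(\partial/\partial y) &= \partial/\partial t - \Delta_y .
\end{aligned}
```

This is the heat operator, the standard parabolic example, so the strengthened sign requirement is consistent with the classical case.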

This discussion of problems with constant coefficients in a half-space is, of course,
only a bare introduction to some of the concepts and problems of linear differential
operators. It is natural, for example, to ask what happens if our domain is more
complicated, or if our equation has variable coefficients. Roughly speaking, one
might hope that if the coefficients do not vary too wildly, then an equation with
variable coefficients is intrinsically parabolic (or hyperbolic or elliptic) if it is formally
of that type at each point. This expectation turns out to be pretty generally true. The
elliptic case is the best known, and the one for which the least regularity of the
coefficients need be assumed. (Strong existence and regularity theorems have been
obtained where the coefficients are no more than measurable as functions of position!)
Usually some sort of uniformity must be imposed in the ellipticity (or parabolicity
or hyperbolicity) of the variable coefficients. Then it is often possible to treat variable
coefficient problems as perturbations of constant-coefficient problems.

A different direction in which one can generalize the classification problem is
to replace the differential operator $\partial/\partial y$ by an abstract operator A. If A generates
a group, it turns out, rather unexpectedly, that the classification of polynomials
$Q(\partial/\partial t, A)$ can be reduced to the classical case $Q(\partial/\partial t, \partial/\partial y)$. (See [9].)

Two standard references on linear differential operators are Hörmander [1] and
Gelfand and Shilov [2], especially volume 3. The book by Shilov [3] has an extensive
discussion on half-space problems. The books by Treves [4] and Friedman
[5] should also be consulted. For hyperbolic equations Lax's notes [6] are
outstanding.


References 


1. L. Hörmander, Linear Partial Differential Operators, Academic Press, New York, 1963.

2. I. M. Gel'fand and G. E. Shilov, Generalized Functions (Vol. 3), Academic Press, New
York, 1967.

3. G. E. Shilov, Generalized Functions and Partial Differential Equations, Gordon and Breach,
New York, 1966.

4. F. Treves, Linear Partial Differential Equations with Constant Coefficients, Gordon and
Breach, New York, 1966.

5. A. Friedman, Generalized Functions and Partial Differential Equations, Prentice-Hall, 1963.

6. Peter Lax, The Theory of Hyperbolic Equations, Lecture Notes, Stanford University.

7. R. Hersh, Boundary conditions for equations of evolution, Arch. Rational Mech. Anal.,
16, No. 4 (1964) 243-264.

8. E. A. Gorin, Asymptotic properties of polynomials and algebraic functions of several variables,
Uspekhi Mat. Nauk (N.S.) 16, No. 1, 93-119. Russian Mathematical Surveys, 1961.

9. R. Hersh, Explicit solution of a class of higher-order abstract Cauchy problems, J. Differential
Equations, 8 (1970) 570-579.

10. Paul J. Cohen, Decision procedures for real and p-adic fields, Comm. Pure Appl. Math., 22
(1969) 131-151.


THE CESARO OPERATORS AND THEIR GENERALIZATIONS: 
EXAMPLES IN INFINITE-DIMENSIONAL LINEAR ANALYSIS 


GERALD LEIBOWITZ, University of Connecticut 


In this article I shall describe some recent results concerning a class of linear 
transformations which were originally studied many years ago by the great Felix 
Hausdorff from the point of view of summability theory (that is, as generalized 
averaging processes) and which have come to be called Hausdorff transformations. 
I shall also indicate some of the methods used in obtaining these results. 

The Hausdorff transformations, of which the familiar methods of Cesaro means 
are the simplest examples, provide a good illustration of the extent to which linear 
operators on infinite-dimensional spaces share some of the properties of their finite- 
dimensional counterparts and of the extent to which their behavior can be very 
different. We shall see an interesting interplay among concepts and techniques from 
basic linear algebra and calculus, classical analysis, and abstract analysis. 


1. Preliminaries. Recall that a Banach space is a normed linear space X in which
every Cauchy sequence is convergent, or equivalently, in which every absolutely
summable series is summable ($\sum \|x_n\| < \infty$ implies that $\sum x_n$ exists in X). We shall
be concerned primarily with the Banach spaces defined as follows. Let p be a real
number greater than 1. The space $l^p$ consists of all sequences $s = \{s_n : n = 0, 1, 2, \ldots\}$
such that $\|s\|_p = (\sum |s_n|^p)^{1/p}$ is finite. The space $L^p(0, 1)$ consists of all Lebesgue
measurable functions f on the unit interval for which $|f|^p$ is integrable, with
$(\int_0^1 |f|^p)^{1/p}$ as norm, and the space $L^p(0, \infty)$ is the analogous space of functions on the
positive real axis. (Throughout this article, all sequences and functions are taken as
complex-valued, and "scalar" will mean "complex number".)

Gerald Leibowitz received his M.I.T. doctoral degree under Kenneth Hoffman. He held a
position at Northwestern University before assuming his present post. He was Associate Director of
CUPM in 1968/69 and is presently a member of the Consultants Bureau. His work on functional
analysis includes Lectures on Complex Function Algebras (Scott, Foresman and Co., 1970). Editor.

By an operator on a Banach space X we shall mean a continuous linear transformation
from X into itself. The operators on X again form a Banach space if the
relevant structures are defined as follows:

$$(S_1 + S_2)(x) = S_1x + S_2x, \qquad (\alpha S)(x) = \alpha \cdot Sx, \qquad \|S\| = \sup\{\|Sx\| : \|x\| \le 1\}.$$

Associated with each operator S are two subsets of the complex plane. The
resolvent set for S is the set $\rho(S)$ consisting of all scalars $\lambda$ such that $\lambda I - S$ is
invertible, where I is the identity transformation on X. (Invertibility here means that
the operator is one-to-one and has full range; continuity of the inverse transformation
then follows from Banach's closed graph theorem. See [13, p. 236].) The spectrum
of S, denoted by $\sigma(S)$, is the complement of $\rho(S)$.

The spectrum of an operator on X is a generalization of the corresponding
finite-dimensional notion. Indeed, if dim X < $\infty$, then $\sigma(S)$ is precisely the set of
eigenvalues of S. However, if X is of infinite dimension, an operator need not have
any eigenvalues, and if it does, they need not exhaust its spectrum. (We shall see
examples below.) Much is known about spectra of operators. First, if $|\lambda| > \|S\|$,
then $\sum_{n=0}^{\infty} \lambda^{-n-1}S^n$ converges in the operator norm to the inverse of $\lambda I - S$, hence $\sigma(S)$
is contained in the disk with radius $\|S\|$ centered at the origin. Moreover, the inverse
of an operator is a continuous function of the operator (see [13], p. 306), so $\rho(S)$ is an
open set. Thus the spectrum of every operator is closed and bounded, hence compact.
An application of Liouville's theorem to the operator-valued analytic function
$(\lambda I - S)^{-1}$ reveals that $\rho(S)$ is a proper subset of the plane, so $\sigma(S)$ is not empty.
(On the other hand, each nonvoid compact set of scalars is the spectrum of some
operator.)
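
The series for the inverse of $\lambda I - S$ can be tried out in finite dimensions. The following is a minimal sketch under stated assumptions: the 2-by-2 matrix `S`, the value `lam`, and all helper names are illustrative choices made here, and plain Python lists stand in for operators.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_resolvent(s, lam, terms=200):
    """Partial sum of sum_{n>=0} lam**(-n-1) * S**n."""
    n = len(s)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)          # S**0
    coeff = 1.0 / lam            # lam**(-1)
    for _ in range(terms):
        total = mat_add(total, scale(coeff, power))
        power = mat_mul(power, s)
        coeff /= lam
    return total

S = [[0.5, 0.25], [0.0, 0.5]]    # operator norm below 1
lam = 2.0                        # |lam| > ||S||
R = neumann_resolvent(S, lam)
lam_I_minus_S = mat_add(scale(lam, identity(2)), scale(-1.0, S))
print(mat_mul(lam_I_minus_S, R))  # close to the identity matrix
```

The terms shrink geometrically with ratio at most $\|S\|/|\lambda|$, which is the same estimate that gives convergence in the operator norm on a Banach space.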

We end this section with a brief discussion of dual spaces and operator adjoints.
Given a Banach space X, the dual space X* consists of all continuous linear functionals
from X to the scalar field, furnished with the following structures:

$$(\phi + \psi)(x) = \phi(x) + \psi(x), \qquad (\alpha\phi)(x) = \alpha\phi(x), \qquad \|\phi\| = \sup\{|\phi(x)| : \|x\| \le 1\}.$$

X* is again a Banach space. (One can identify the duals of $l^p$, $L^p(0, 1)$, and $L^p(0, \infty)$
with the spaces $l^q$, $L^q(0, 1)$, $L^q(0, \infty)$ where 1/p + 1/q = 1. The dualities are defined
as follows: $\langle s, t\rangle = \sum s_n t_n$, s in $l^p$, t in $l^q$; $\langle f, g\rangle = \int f(x)g(x)\,dx$, f in $L^p$, g in $L^q$.)
Each operator S on X determines an operator S* on X*, the adjoint of S, given by
$(S^*\phi)(x) = \phi(Sx)$. The mapping $S \to S^*$ is linear, preserves operator norms, and is
anti-multiplicative: $(S_1S_2)^* = S_2^*S_1^*$. From this it is clear that $\rho(S)$ is contained in
$\rho(S^*)$, or equivalently that the spectrum of S* is a subset of the spectrum of S. If,
moreover, each continuous linear functional on X* is given by evaluation at some
member of X (in this case one writes X = X** and says that the space X is reflexive),
then S and S* have the same spectrum. This is an important remark for our study
below since we shall determine the spectra of certain operators in two stages: in the
first it is shown that one of S or S* has a certain large set of eigenvalues; in the second
it is shown that for structural reasons, the spectrum cannot contain any scalars
outside the closure of the eigenvalue set.




2. Cesaro operators. Associated with each sequence $s = \{s_n\}$ of complex numbers
is its sequence of arithmetic means $\{\tau_n\}$ given by

$$\tau_n = (s_0 + s_1 + \cdots + s_n)/(n + 1), \qquad n = 0, 1, 2, \ldots.$$

The basic theorem of Hölder and Cesaro asserts that if $s_n$ converges to a limit $s_\infty$,
then $\tau_n \to s_\infty$ as well. Any sequence-to-sequence transformation which preserves
convergence and limits of all convergent sequences is said to be a regular method
of summability. In the next section we shall consider the properties of a large class
of summability methods (which contains many of the best-known regular methods)
using the Hölder-Cesaro operator as our model. (For information about summability
theory one may consult [3] and [6].)

The transformation $C_0$ which associates with $\{s_n\}$ the sequence $\{\tau_n\}$ can be
thought of as a linear transformation on the space $E^\infty$ of all scalar-valued sequences.
By analogy with the finite-dimensional situation, one would expect $C_0$ to have a
countably infinite family of eigenvalues, and indeed it does. For if m is an arbitrary
nonnegative integer and $s^{(m)}$ is the sequence with entries $s^{(m)}_n = 0$ for $0 \le n < m$,
$s^{(m)}_{m+k} = \binom{m+k}{k}$ for k = 0, 1, 2, ..., then $C_0 s^{(m)} = (m + 1)^{-1}s^{(m)}$.
Moreover, it is not difficult to prove that the numbers $(m + 1)^{-1}$ are the only
eigenvalues of $C_0$ and that each characteristic subspace is of dimension 1. We should note
in passing that none of the eigenvectors for $C_0$ is a bounded sequence.
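
The eigenvector claim is easy to test with exact arithmetic. The sketch below is an illustration added here; the function names are hypothetical, and only the first few entries of each sequence are checked, which suffices because each mean depends only on earlier terms.

```python
from fractions import Fraction
from math import comb

def cesaro_means(s):
    """Arithmetic means tau_n = (s_0 + ... + s_n) / (n + 1)."""
    out, running = [], Fraction(0)
    for n, value in enumerate(s):
        running += value
        out.append(running / (n + 1))
    return out

def eigen_sequence(m, length):
    """First `length` entries of s^(m): m zeros, then C(m + k, k), k = 0, 1, ..."""
    return [Fraction(0)] * m + [Fraction(comb(m + k, k)) for k in range(length - m)]

for m in range(4):
    s = eigen_sequence(m, 12)
    # C_0 s^(m) should equal s^(m) / (m + 1), entry by entry.
    assert cesaro_means(s) == [x / (m + 1) for x in s]
```

The identity behind the check is the "hockey stick" sum $\sum_{j=0}^{k}\binom{m+j}{j} = \binom{m+k+1}{k}$, which after division by m + k + 1 gives $\binom{m+k}{k}/(m+1)$.
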
For each p, $l^p$ is a linear subspace of $E^\infty$, and one may ask whether $C_0$ takes $l^p$
into itself. That it does is a consequence of Hardy's inequality for sums, which
asserts that

$$\sum |\tau_n|^p \le q^p \sum |s_n|^p$$

and that the constant $q^p$ cannot be replaced by any smaller number. (For a proof,
see [7], p. 239.) In terms of operators we can state: the restriction of $C_0$ to $l^p$ is an
operator of norm q. Moreover, the restricted operator, which we shall continue to
call $C_0$, has no eigenvalues! Since the spectrum is the infinite-dimensional substitute
for the set of eigenvalues, we are led to wonder: what is $\sigma(C_0, l^p)$?

Functions of a continuous variable can also be averaged. If f is measurable and is
integrable over each subinterval (0, x), then

$$F(x) = \frac{1}{x}\int_0^x f(t)\,dt \tag{1}$$

is defined and continuous. G. H. Hardy proved the following inequalities, which
are analogous to his inequality for sums (see [7], p. 240):

$$\int_0^1 |F(x)|^p\,dx \le q^p\int_0^1 |f(t)|^p\,dt, \qquad \int_0^\infty |F(x)|^p\,dx \le q^p\int_0^\infty |f(t)|^p\,dt,$$

and once again the constants are best possible. Thus there are operators $C_1$ on
$L^p(0, 1)$ (the finite-range Cesaro integral operator) and $C_\infty$ on $L^p(0, \infty)$ (the infinite-range
Cesaro integral operator) which associate with each f in the given function
space the function F given by (1). According to the inequalities, $\|C_1\| = q$ and
$\|C_\infty\| = q$. Again the questions of spectra arise.

The eigenvalue problems are easily disposed of. Continuity of F implies, by the
fundamental theorem of calculus, that any eigenvector must be continuously
differentiable and must be a solution of the Euler differential equation $\lambda(xy)' = y$. (The
possibility that 0 might be an eigenvalue must be dealt with separately.) The solutions
are the constant multiples of $x^\beta$ where $\beta = \lambda^{-1} - 1$. None of these lie in $L^p(0, \infty)$
and so $C_\infty$ has no eigenvalues. On the other hand, since $x^\beta$ belongs to $L^p(0, 1)$ if and
only if the real part of $p\beta$ exceeds -1, the set of eigenvalues of $C_1$ is the entire open
disk $\{\lambda : \operatorname{Re}(\lambda^{-1}) > q^{-1}\}$.

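The spectral facts for the operators $C_0$, $C_1$, $C_\infty$ can be summarized as follows.


THEOREM 1. (a) $\sigma(C_0, l^p) = \{\lambda : \operatorname{Re}(\lambda^{-1}) \ge q^{-1}\} \cup \{0\} = \{\lambda : |\lambda - q/2| \le q/2\}$.

(b) $\sigma(C_1, L^p(0, 1)) = \sigma(C_0, l^p)$.

(c) $\sigma(C_\infty, L^p(0, \infty)) = \{\lambda : \operatorname{Re}(\lambda^{-1}) = q^{-1}\} \cup \{0\}$.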
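
The two descriptions of the set in (a) agree by an elementary computation, written out here as an added worked step (the real and imaginary parts x, y of $\lambda$ are notation introduced only for this check). For $\lambda = x + iy \ne 0$:

```latex
\begin{aligned}
\operatorname{Re}(\lambda^{-1}) \ge q^{-1}
  &\iff \frac{x}{x^{2} + y^{2}} \ge \frac{1}{q}
   \iff x^{2} + y^{2} \le q\,x \\
  &\iff \Bigl(x - \frac{q}{2}\Bigr)^{2} + y^{2} \le \Bigl(\frac{q}{2}\Bigr)^{2}
   \iff \Bigl|\lambda - \frac{q}{2}\Bigr| \le \frac{q}{2}.
\end{aligned}
```

The point $\lambda = 0$ lies on the boundary circle but is excluded by the condition on $\lambda^{-1}$, which is why it is adjoined separately. Replacing the inequalities by equalities gives the circle in (c).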


In order to prove (c), one exhibits specific formulas for the inverse of $\lambda I - C_\infty$
for $\lambda$ exterior to the circle and for $\lambda$ interior to the circle. For example, if $|\lambda - q/2| > q/2$,
then $\lambda^{-1}I + \lambda^{-2}P_\lambda$ is the inverse of $\lambda I - C_\infty$, where

$$(P_\lambda f)(x) = \int_0^1 f(xt)\,t^{-1/\lambda}\,dt.$$

One then shows that no point $\lambda_0$ on the circle belongs to the resolvent set because
$\|P_\lambda\|$ diverges to infinity as $\lambda$ approaches $\lambda_0$. The arguments for (a) and (b) are
symmetric. Just as the interior of the disk $|\lambda - q/2| < q/2$ consists of eigenvalues
for $C_1$, the same open disk also consists of eigenvalues for $C_0^*$; so both spectra contain
the closed disk. On the other hand, if $\lambda$ is exterior to the disk, the restriction, or
rather the compression, of $\lambda^{-1}I + \lambda^{-2}P_\lambda$ to $L^p(0, 1)$ inverts $\lambda I - C_1$ and a
sequence-to-sequence analogue of that operator inverts $\lambda I - C_0$. (For proofs see [1] and [11].
Note that $C_0$ acting on $l^p$ has no eigenvalues, yet its adjoint has uncountably many
eigenvalues, certainly not what one would naively guess about so simple an
operator.)
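
The inverse formula can be made plausible by a formal check on powers $x^\beta$. This is an added illustration, not the proof in [1] or [11]; powers do not belong to $L^p(0, \infty)$, so the computation only tests the algebra, and it assumes $\operatorname{Re}\beta > -1$ and $\operatorname{Re}(\beta - 1/\lambda) > -1$ so that the integrals converge.

```latex
\begin{aligned}
C_\infty x^{\beta} &= \frac{1}{x}\int_0^x t^{\beta}\,dt = \frac{x^{\beta}}{\beta + 1},
\qquad
P_\lambda x^{\beta} = x^{\beta}\int_0^1 t^{\beta - 1/\lambda}\,dt
                    = \frac{x^{\beta}}{\beta + 1 - 1/\lambda}, \\[4pt]
\lambda^{-1} + \frac{\lambda^{-2}}{\beta + 1 - 1/\lambda}
  &= \lambda^{-1}\Bigl(1 + \frac{1}{\lambda(\beta + 1) - 1}\Bigr)
   = \frac{\beta + 1}{\lambda(\beta + 1) - 1}
   = \Bigl(\lambda - \frac{1}{\beta + 1}\Bigr)^{-1}.
\end{aligned}
```

So on each power the operator $\lambda^{-1}I + \lambda^{-2}P_\lambda$ acts as the reciprocal of $\lambda I - C_\infty$, which is what the stated formula requires.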

Since q = p/(p - 1), the influence of the space $l^p$ on the spectrum of $C_0$ is clearly
seen, but one may ask what the disk in (a) has to do with the numbers 1/(n + 1)
which enter in the definition of $C_0$. We shall find the answer below.




3. Hausdorff operators. Consider an arbitrary complex-valued measurable function
k on the unit interval which satisfies the following integrability condition

$$\int_0^1 t^{-1/p}\,|k(t)|\,dt < \infty. \tag{2}$$

Associated with k are three linear operators which we shall call the Hausdorff
operators determined by k. (That the transformations are indeed operators on the
indicated spaces was proved by Hardy; see [6, Chapter XI].)

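The discrete operator $S_0 = S_0(k)$ is defined by the formula

$$(S_0 s)_n = \sum_{m=0}^{n} \binom{n}{m}\kappa_{n,m}\,s_m \quad (s \text{ in } l^p), \tag{3}$$

where

$$\kappa_{n,m} = \int_0^1 t^m(1 - t)^{n-m}k(t)\,dt. \tag{4}$$

The finite-range operator $S_1 = S_1(k)$ is defined by

$$(S_1 f)(x) = \int_0^1 f(xt)k(t)\,dt \quad (0 < x < 1,\ f \text{ in } L^p(0, 1)), \tag{5}$$

while the infinite-range operator is given by

$$(S_\infty f)(x) = \int_0^1 f(xt)k(t)\,dt \quad (0 < x < \infty,\ f \text{ in } L^p(0, \infty)). \tag{6}$$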
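
Formulas (3) and (4) can be evaluated exactly for simple kernels. The sketch below is an illustration added here, under the assumption that the kernel is $k_a(t) = a\,t^{a-1}$ with a positive integer a, so that (4) is a Beta integral; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def kappa(n, m, a=1):
    """kappa_{n,m} = integral_0^1 t**m (1-t)**(n-m) k_a(t) dt for the
    kernel k_a(t) = a * t**(a-1), a a positive integer (a Beta integral)."""
    return Fraction(a * factorial(m + a - 1) * factorial(n - m), factorial(n + a))

def hausdorff_row(n, a=1):
    """Row n of the matrix in (3): entries C(n, m) * kappa_{n,m}, 0 <= m <= n."""
    return [comb(n, m) * kappa(n, m, a) for m in range(n + 1)]

# a = 1 gives the constant kernel 1: every entry of row n equals 1/(n + 1),
# so the discrete Hausdorff operator is the arithmetic-mean operator.
for n in range(6):
    assert hausdorff_row(n, a=1) == [Fraction(1, n + 1)] * (n + 1)

# Each row sums to the total integral of the kernel, which is 1.
assert all(sum(hausdorff_row(n, a=3)) == 1 for n in range(8))
```

The row-sum check reflects the binomial theorem: summing $\binom{n}{m}t^m(1-t)^{n-m}$ over m gives 1 under the integral sign.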


Note that the kernel $k_1(t) = 1$ yields the Cesaro operators $C_0$, $C_1$, $C_\infty$. The
inverting operator $P_\lambda$ mentioned in the proof of Theorem 1 corresponds to $k(t) = t^{-1/\lambda}$.
Various choices of the kernel k yield other sequence-to-sequence operators and
integral transforms which are well known to workers in summability theory. Let us
list some of these here:


Cesaro means of order $\alpha$:            $k(t) = \alpha(1 - t)^{\alpha - 1}$
Hölder means of order $\alpha$:            $k(t) = \frac{1}{\Gamma(\alpha)}(\log t^{-1})^{\alpha - 1}$
Gamma means $\Gamma_a^\alpha$:               $k(t) = \frac{a^\alpha}{\Gamma(\alpha)}\,t^{a-1}(\log t^{-1})^{\alpha - 1}$
generalized Cesaro means $C_{a,\alpha}$:   $k(t) = \frac{\Gamma(a + \alpha)}{\Gamma(a)\Gamma(\alpha)}\,t^{a-1}(1 - t)^{\alpha - 1}$;

note in particular that $\Gamma_a^1$ corresponds to the kernel $k_a(t) = at^{a-1}$. (The
corresponding integral operators take the form

$$F_a(x) = \frac{a}{x^a}\int_0^x u^{a-1}f(u)\,du.)$$



If the parameters are assumed to satisfy the conditions a > 1/p, $\alpha$ > 0, then the
integrability condition (2) is satisfied, and indeed, in each of the examples a
strengthened condition prevails:

$$\int_0^1 t^{-1/p - \gamma}\,|k(t)|\,dt < \infty \quad \text{for some } \gamma > 0. \tag{2*}$$

Moreover, as one would expect from an averaging process, each summability kernel
in the list is nonnegative and has total integral 1. (One may replace the absolutely
continuous signed measures k(t)dt by more general measures, but we shall not
discuss the generalizations.)

If we forget the integrability conditions for the moment and consider any discrete
Hausdorff transformation defined on $E^\infty$ by (3) and (4), where k is merely integrable
over (0, 1), we find a curious result: The eigenvectors of $S_0(k)$ are the same as those
for $C_0$; indeed, $S_0 s^{(m)} = \mu_m s^{(m)}$ for each nonnegative integer m, where

$$\mu_m = \int_0^1 t^m k(t)\,dt$$

is the mth moment of k. This leads one to jump to the conclusion that the spectral
facts for $S_0$, $S_1$, and $S_\infty$ should be the same, in some sense, as those for $C_0$, $C_1$, and
$C_\infty$. And in a strong sense this is in fact so.
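
The statement about shared eigenvectors can be tested exactly for a sample kernel. The following sketch is an added illustration assuming the kernel $k_a(t) = a\,t^{a-1}$ with integer a, whose mth moment is a/(m + a); the function names are hypothetical, and only finitely many entries are compared, which is legitimate because the matrix in (3) is lower triangular.

```python
from fractions import Fraction
from math import comb, factorial

def hausdorff_apply(s, a):
    """Apply S_0(k_a), k_a(t) = a * t**(a-1) with integer a >= 1, to the
    finite sequence s using formulas (3) and (4)."""
    out = []
    for n in range(len(s)):
        total = Fraction(0)
        for m in range(n + 1):
            # Beta integral for kappa_{n,m} with the kernel k_a.
            kappa = Fraction(a * factorial(m + a - 1) * factorial(n - m),
                             factorial(n + a))
            total += comb(n, m) * kappa * s[m]
        out.append(total)
    return out

def eigen_sequence(m, length):
    """First `length` entries of s^(m): m zeros, then C(m + k, k), k = 0, 1, ..."""
    return [Fraction(0)] * m + [Fraction(comb(m + k, k)) for k in range(length - m)]

a = 2
for m in range(4):
    s = eigen_sequence(m, 10)
    mu_m = Fraction(a, m + a)          # m-th moment of k_a
    assert hausdorff_apply(s, a) == [mu_m * x for x in s]
```

The general reason is the identity $\sum_j \binom{n}{j}\binom{j}{m}t^j(1-t)^{n-j} = \binom{n}{m}t^m$, which turns the nth entry of $S_0 s^{(m)}$ into $\binom{n}{m}\mu_m$.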


THEOREM 2. (a) $S_0$ has no eigenvectors in $l^p$, but if z is any scalar such that
Re(z) > -1/p, then the sequence $f_z$ defined by

$$(1 - w)^z = \sum f_z(n)\,w^n$$

belongs to $l^q$ and is an eigenvector of $S_0^*$. The corresponding eigenvalue is
$K(z) = \int_0^1 t^z k(t)\,dt$. The spectrum of $S_0$ consists of 0 together with the range of K(z) on
the half-plane Re(z) $\ge$ -1/p.

(b) If Re(z) > -1/p, then the function $g_z$ defined by $g_z(t) = t^z$ belongs to
$L^p(0, 1)$ and is an eigenvector of $S_1$ with corresponding eigenvalue K(z). The spectrum
of $S_1$ is the same as the spectrum of $S_0$.

(c) If k satisfies condition (2*), then the spectrum of $S_\infty$ is the union of {0} with
the range of K(z) on the line Re(z) = -1/p.


It is a consequence of the theorem that the spectrum of a Hausdorff operator is
always a connected set containing the origin. The analytic geometry of the spectrum
can be very intricate, as the reader may determine by calculating the moment function
K for each of the examples.
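
For reference, the moment functions $K(z) = \int_0^1 t^z k(t)\,dt$ of the four families of kernels can be written in closed form. These are standard Beta and Gamma integrals, worked out here as an added aid rather than quoted from the original; the kernels are restated in the left-hand column.

```latex
\begin{aligned}
k(t) = \alpha(1-t)^{\alpha-1}:\quad
  & K(z) = \frac{\Gamma(\alpha+1)\,\Gamma(z+1)}{\Gamma(z+\alpha+1)}, \\
k(t) = \frac{1}{\Gamma(\alpha)}(\log t^{-1})^{\alpha-1}:\quad
  & K(z) = (z+1)^{-\alpha}, \\
k(t) = \frac{a^{\alpha}}{\Gamma(\alpha)}\,t^{a-1}(\log t^{-1})^{\alpha-1}:\quad
  & K(z) = \Bigl(\frac{a}{z+a}\Bigr)^{\alpha}, \\
k(t) = \frac{\Gamma(a+\alpha)}{\Gamma(a)\Gamma(\alpha)}\,t^{a-1}(1-t)^{\alpha-1}:\quad
  & K(z) = \frac{\Gamma(a+\alpha)\,\Gamma(z+a)}{\Gamma(a)\,\Gamma(z+a+\alpha)}.
\end{aligned}
```

The logarithmic cases follow from the substitution $t = e^{-u}$. In each case K(0) = 1, in agreement with the kernel having total integral 1.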

Since $K_1(z) = \int_0^1 t^z k_1(t)\,dt = (z + 1)^{-1}$, Theorem 1 follows at once from Theorem
2. Moreover, the numbers 1/(n + 1) are just the moments $K_1(n)$. In general, the
values of the moment function K, which the reader may perhaps recognize as a kind
of Mellin transform of k on the half-plane or line with infinity adjoined, form the
spectrum while its values at the nonnegative integers are the diagonal entries
$\mu_m = \kappa_{m,m}$ in the infinite matrix representation of the discrete operator $S_0$.

The facts about eigenvalues stated in Theorem 2 can be established by direct
computation, once it is realized what they should be. The remaining assertions can
be proved using facts about certain Banach algebras and groups of operators. These
details will be published separately in [10]. (We note also that D. W. Boyd has
recently shown that conclusion (c) follows from the less stringent condition (2). See
his forthcoming article entitled "Spectra of Convolution Operators.")


4. Hilbert space considerations. Current interest in the Cesaro operators is due
principally to the article [2] of Brown, Halmos, and Shields. These authors restrict
themselves to the case where p = 2. Since $l^2$, $L^2(0, 1)$, and $L^2(0, \infty)$ derive their
norms from inner products, one might expect that a more delicate structural theory
would prevail for the Hausdorff transformations viewed as operators on these
Hilbert spaces. To an extent the expectations are justified.

Indeed, in [2] the following remarkable theorems are proved:

(i) $I - C_1^*$ is a simple unilateral shift operator on $L^2(0, 1)$;

(ii) $I - C_\infty^*$ is a simple bilateral shift operator on $L^2(0, \infty)$.

(Shifts are defined as follows. If X is a separable Hilbert space, an operator S on
X is a simple unilateral shift provided that there is a maximal orthonormal sequence
$\{e_n : n = 0, 1, 2, \ldots\}$ in X such that $Se_n = e_{n+1}$ for every n. An operator T on X is a
simple bilateral shift provided that there exists a maximal orthonormal bi-sequence
$\{e_n : n = 0, \pm 1, \pm 2, \ldots\}$ such that $Te_n = e_{n+1}$ for every n. The story of the shift
operators is elegantly told in [4] and [5].)

The discrete operator cannot satisfy a condition analogous to (i) or (ii) since its
adjoint has eigenvalues and shifts have none. The operator itself has some resemblance
to a shift, but it is not one. Specifically, the following are true:


(iii) $I - C_0$ is unitarily equivalent to the operation $f(z) \to zf(z)$ on the closed
subspace of $L^2(\beta)$ spanned by the polynomials, where $\beta$ is a certain probability
measure on the unit disk |z| < 1;


(iv) $I - C_0$ is not similar to any weighted shift. (We recall that operators A and
B are similar if $A = PBP^{-1}$ for some invertible operator P; if P can be chosen to be
unitary, then A and B are unitarily equivalent. A weighted shift has the form SD
where S is a shift and D is an operator which multiplies each $e_n$ by a scalar $\beta_n$. See
[8], [9].)

Theorems (i) and (ii) have been extended to the Gamma methods of order 1 (see
[12]). The results read:


(v) $I - (2 - a^{-1})S_1(k_a)^*$ is a simple unilateral shift on $L^2(0, 1)$;
(vi) $I - (2 - a^{-1})S_\infty(k_a)^*$ is a simple bilateral shift on $L^2(0, \infty)$.
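
The constant $2 - a^{-1}$ can be anticipated from Theorem 2(c). The following is an added consistency check, not the proof in [12]; it uses the fact that the spectrum of a simple bilateral shift is the unit circle, together with the moment function of the kernel $k_a(t) = a\,t^{a-1}$ on the line $z = -\tfrac12 + iy$ (p = 2), where c denotes a trial constant.

```latex
\begin{aligned}
K_a(z) &= \int_0^1 t^{z}\,a\,t^{a-1}\,dt = \frac{a}{z + a}, \\[4pt]
1 - c\,K_a\bigl(-\tfrac12 + iy\bigr)
  &= \frac{(a - \tfrac12 - ca) + iy}{(a - \tfrac12) + iy}, \\[4pt]
\bigl|1 - c\,K_a\bigl(-\tfrac12 + iy\bigr)\bigr| = 1 \ \text{for all real } y
  &\iff \bigl|a - \tfrac12 - ca\bigr| = \bigl|a - \tfrac12\bigr|
   \iff c = 0 \ \text{ or } \ c = 2 - a^{-1}.
\end{aligned}
```

So $c = 2 - a^{-1}$ is the only nonzero constant for which the spectrum of $I - cS_\infty(k_a)$ lies on the unit circle; for a = 1 it reduces to c = 1, the Cesaro case in (ii).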


But one can generalize no further (see [10]):


(vii) If $k \ge 0$, $\int_0^1 k(t)\,dt = 1$, and $\int_0^1 t^{-1/2}k(t)\,dt < \infty$, then if for some c > 0,
either $I - cS_1(k)^*$ or $I - cS_\infty(k)^*$ is a shift operator, then $k = k_a$ for some a > 1/2.


Undoubtedly the summability kernels with $\alpha \ne 1$ will also turn out to be
classifiable according to structural properties of the associated operators.

Some additional properties of Hausdorff operators on Hilbert spaces are known
(for instance, $S_\infty$ is always a normal operator and can be represented as the adjoint
of a convolution operator on $L^2(-\infty, +\infty)$ generated by an integrable function
which vanishes outside $(0, \infty)$) and more remain to be discovered. We hope that this
article has convinced the reader that the algebraic and the classical points of view in
analysis can each provide the other with a lively stimulus and a source of problems
for study, and that "classical" need not mean "obsolete," even in the era of "now!"


References 


1. D. W. Boyd, The spectrum of the Cesaro operator, Acta Sci. Math., Szeged, 29 (1968) 31-34.

2. A. Brown, P. R. Halmos, and A. L. Shields, Cesaro operators, Acta Sci. Math., Szeged, 26
(1965) 125-137.

3. R. G. Cooke, Infinite Matrices and Sequence Spaces, Macmillan, London, 1950.

4. P. R. Halmos, Shifts on Hilbert spaces, J. Reine Angew. Math., 208 (1961) 102-112.

5. ———, A Hilbert Space Problem Book, Van Nostrand, Princeton, N. J., 1967.

6. G. H. Hardy, Divergent Series, Clarendon Press, Oxford, 1949.

7. G. H. Hardy, J. E. Littlewood, and G. Polya, Inequalities, reprinted second edition,
Cambridge University Press, 1967.

8. T. L. Kriete and D. Trutt, The Cesaro operator in $l^2$ is subnormal, Amer. J. Math., 93 (1971)
215-225.

9. ——— and ———, On the Cesaro operator, to appear.

10. G. Leibowitz, A convolution approach to Hausdorff integral operators, to appear.

11. B. E. Rhoades, Spectra of some Hausdorff operators, Acta Sci. Math., Szeged, 32 (1971)
91-100.

12. N. K. Sharma, article to appear in Acta Sci. Math., Szeged.

13. G. F. Simmons, Introduction to Topology and Modern Analysis, McGraw-Hill, New York,
1963.


A. A. ALBERT 


D. ZELINSKY, Northwestern University 


Abraham Adrian Albert died on June 6, 1972. The world lost a renowned mathe- 
matician, a vigorous force for the advancement of mathematics, and a very warm 
and understanding human being. From his birth to his death, he was associated 
with Chicago. As an inveterate traveller, he left that city often, for far parts of the 
world, but he always returned. He was born in Chicago on November 9, 1905, he 
went to school in Chicago (except for two years when his family moved to Iron 
Mountain, Michigan), he did all his undergraduate and graduate work at the University
of Chicago. After receiving his Ph.D., he left for three years at Princeton
and at Columbia Universities, then returned to the University of Chicago where 
he was a faculty member until the end of his life. With this as his base, he worked 
in many mathematical centers at various times in his career: The Institute for Advanced
Study in Princeton (1933-34), Universities of Brazil and Buenos Aires (1947),
University of Southern California (1950), Yale University (1956-1957), University 
of California at Los Angeles (1958). He operated in Washington in many capacities, 
and in the International Mathematical Union. His most recent official trip was a 
visit to the USSR in 1971 as a guest of the Soviet Academy. 

To his friends Professor Albert was known as Adrian. Many mathematicians
referred to him affectionately as A³. He was the son of a Jewish family that came to
America from England. His father insisted on a Jewish but not very religious training. 
Albert distinguished himself early in his schools (Herzl and Marshall) on the West 
Side of Chicago, where the intellectual competition from the other budding scholars 
was keen. He spent four years earning his Bachelor’s degree at the University of 
Chicago, but one year later he had his Master’s degree, and a year after that, his 
Ph. D. In 1928, at age 22, his Ph. D. dissertation already stamped him as one of the 
outstanding algebraists of his day. 

Those were the days when the mathematical leaders at the University of Chicago 
were L. E. Dickson in algebra and E. H. Moore in general topology. Dickson was 
Albert’s thesis advisor and is the one mainly responsible for steering Albert into 
the subject of algebras over fields, which is the subject that primarily concerned 
him throughout his career. 

He was one of the early National Research Council Fellows (1928-29). This 
fellowship was the forerunner of the modern NSF Postdoctoral Fellowships (which 
unfortunately were discontinued recently) and has been held by some of the most 
famous American mathematicians. 

The precocity continued. At the age of 35, Albert was promoted to a full professorship
at the University of Chicago (at that time it was virtually unheard of to hold
such a position before the age of 40). Two years later he was elected to membership 
in the National Academy of Sciences, a 37-year old academician. 

The list of other honors heaped on him, and of honorific duties he was asked to 
perform would run to more pages than this article. We mention just a sample: 
chairmanship (1958-1962) and deanship (1962-1971) at the University of Chicago, 
presidency of the American Mathematical Society (1965-66), trusteeship of the Institute
for Advanced Study (1969-72), chairmanship of the International Mathematical
Union’s organizing committee for the 1970 Congress in Nice, membership in the 
Brazilian and Argentine Academies of Sciences, several editorships, the Cole Prize in 
Algebra (1939), and three honorary degrees. He seemed to collect these honors with 
enthusiasm, and executed the duties with vigor. 

Although Albert worked on matrix theory, on quadratic forms, and other aspects
of algebra, there is no question that his central interest was always the study
of finite dimensional algebras over a field. In the old days, they were called hyper- 
complex systems. They are finite dimensional vector spaces with a multiplication 
that associates to every two vectors in the space another vector, the product. A sug- 
gestive example is the four-dimensional algebra of quaternions over the field of real 
numbers. The classical Wedderburn theorems essentially reduce the study of associ- 
ative algebras over a field to the classification of the division algebras (like the algebra 
of quaternions, for example). Over any field F, a four-dimensional division algebra 
with center F must be an algebra of “generalized quaternions” whose multiplication
rules are much like the ordinary quaternions: a basis 1, i, j, ij with $ij = -ji$ and $i^2 = \alpha$
and $j^2 = \beta$ elements of F, which have no square roots in F (but are not necessarily $-1$).
If one wants to generalize to dimensions higher than 4 there are two candidates: 
the cyclic algebras and the still more general crossed product algebras. (A theorem 
asserts that, in any case, the dimension of any central division algebra is a perfect 
square.) Wedderburn had already proved that central division algebras of dimension 9 
are all cyclic algebras. In Albert’s dissertation (1928) he proved that central division
algebras of dimension 16 are not necessarily cyclic algebras, but are always crossed 
products. Although Albert’s theorem raised the obvious question about algebras 
of dimension 25, 36, etc., his result has stood without essential improvement or 
embellishment (though not for lack of trying) until some nice, complementary, but 
still not definitive results of Amitsur and others in 1971. 

This study put the young Albert in the center of what was to be one of the major 
breakthroughs in the theory of algebras: the determination of all central division 
algebras over the special field of rational numbers, or more generally over any alge- 
braic number field. In this case, it turns out that they are all cyclic algebras—this is the 
famous Hasse-Brauer-Noether Theorem (1931). An interesting article by Hasse 
and Albert in the Transactions of the American Mathematical Society (1932) traces 
the history of this theorem and relates the story of Albert’s near miss. On the basis 
of his results on algebras and some results announced by Hasse, Albert published 
some theorems that nearly proved the big theorem, and he wrote Hasse about it. 
Somehow the communication was bad, and the Brauer-Hasse-Noether manuscript 
was submitted for publication without mention of Albert’s independent contribu- 
tions. The 1932 Transactions article shows that in fact the big theorem follows from 
Albert’s results in just a few lines. 

Albert was hurt and disappointed by this incident. But the depth of that hurt 
could not compare with his feelings about the subsequent Nazi scourge which caused 
some important German mathematicians to begin distinguishing between “Aryan”
and “Semitic” mathematics, and which resulted in the exodus of so many German
scientists, Jews and non-Jews alike, including both Richard Brauer and Emmy 
Noether. 

Albert was invited to be a member of the Institute for Advanced Study in Prince- 
ton during its opening year in 1933-34. (Another distinguished member, who arrived 
that year and remained on a permanent basis was Albert Einstein.) This contact
with Princeton was profitable for Albert. His associations with Lefschetz in par- 
ticular resulted in one of Albert’s mathematical accomplishments that he always 
regarded with greatest pleasure, and for which he later won the American Mathe- 
matical Society’s Cole Prize in algebra. Already in 1929, Lefschetz had interested 
Albert in a major unsolved problem in the theory of algebraic functions, Riemann 
surfaces and Abelian varieties. In a series of papers (1929-1934) Albert produced 
a definitive solution. What was required was a classification of the algebraic corre- 
spondences of a Riemann surface (automorphisms of a complex curve). This had 
been reduced to the problem of finding the matrices that commute with a certain 
“Riemann matrix” of periods of basic Abelian integrals on the Riemann surface. 
These commuting matrices form an algebra, and in the basic cases, a central simple 
algebra over the rational number field. This version of the problem was right in the 
center of Albert’s special expertise, and he demolished it.

Later, he attacked the problem of general nonassociative algebras that are finite
dimensional over a field. Almost single-handed he influenced a large number of 
young mathematicians to break this seemingly unpromising ground. Special algebras 
had been studied that were not associative but which obeyed axioms substituting 
for the associative law (Lie algebras, Jordan algebras, alternative algebras). Results 
like the Wedderburn theorems had been proved for some of them; in fact, the results
for Lie algebras over the complex number field were proved by E. Cartan
before Wedderburn obtained his corresponding theorems for the associative algebras. 
But Albert had the idea of using associative algebra theory to prove analogs of the 
Wedderburn theorems for quite arbitrary nonassociative algebras (even a nonassociative
algebra has an associative “regular representation” algebra). It is a sign of his
genius that he was actually able to develop a reasonable theory and also significantly 
influence the theory and applications of the special algebras we mentioned. 

Albert’s style of algebra was almost inimitable. He had a diabolical facility with 
manipulation of identities--an enterprise in which most mathematicians founder, 
never being able to see the forest for the trees. Somehow, Albert could see through 
mazes of symbols to the inner workings of all those polynomials in several variables 
or multiplication tables of complicated algebras. 

Mathematics was Albert’s great enthusiasm. It was impossible to associate with 
him for any length of time without feeling the vigor with which he pursued his theorems. 
He was always willing to talk about his latest mathematical exploits. When his son, 
Alan, was still very young, Albert insisted on explaining even to him his latest the- 
orems, patiently describing the necessary ingredients to the intrigued schoolboy 
who had not yet formally seen any real mathematics. 

Albert was a prolific author of textbooks, research treatises, and more than a 
hundred and thirty research papers, the last of which is due to appear soon. 

A minor theme running through Albert’s life was his fascination with cameras, 
radios and other gadgets. I have always thought that this streak was responsible 
for his activity in the Applied Mathematics Group at Northwestern University
during World War II and his association with the Rand Corporation (1951 and 1952), 
Southern California Applied Mathematics Project (1953-55; he was its chairman 
1959-60) and the Institute for Defense Analysis (director, Communication Research 
Division 1961-62, trustee 1969-72). His principal contribution in these activities 
was to cryptanalysis and coding. He was also a lifelong aficionado of detective stories, 
which he devoured at an enormous rate. 

To his friends, Professor Albert is remembered as a devoted family man. He 
married Frieda Davis in his second year of graduate study, and they shared a close 
relationship for all the subsequent forty-four years. They had two sons and a daughter 
and five grandchildren. (Tragically one son died of an illness at the age of 25.) Perhaps 
one should also count his 29 Ph. D. students whom he treated almost as members 
of his family. 

He was always pleased to use his influence in Washington to improve the status 
of mathematicians in general, and he was willing to do the same for individual 
mathematicians whom he considered worthy. One of the more homey causes to 
which he lent the weight of his reputation was retaining an apartment building at 
the University of Chicago for visiting mathematics faculty and their families. There 
are families throughout the world that remember this little mathematical microcosm 
with pleasure. 

Everyone who knew him will remember his vigorous but round, medium build, 
curly hair, and often boyish demeanor; but especially one must remember his great, 
pleased grin that he flashed to welcome news of new successes for any of his ex- 
tended family anywhere in the world of mathematics. 


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


FUNCTIONS SATISFYING A MEAN VALUE PROPERTY AT THEIR ZEROS 
D. P. STANFORD, College of William and Mary 


We ask whether a function on the plane whose average at its zeros is zero, and 
which assumes the value zero on an open set, must be zero everywhere. 

Specifically, let R denote the real numbers and for p in $R^2$ and r > 0, let D(p, r)
denote the open disc of radius r centered at p.


DEFINITION: A function F from $R^2$ to R has property Z provided $\iint_{D(p,r)} F = 0$
for all r > 0 whenever F(p) = 0.




We now ask: For what open sets U and what classes C of functions can we assert 
that if F is in C, F has property Z, and F = 0 on U, then F = 0 on $R^2$?

The question arises in the study of functions satisfying the weighted average 
property of A. K. Bose [1]; more specifically in trying to determine whether a non- 
constant function u, satisfying the weighted average property with respect to a positive 
continuous weight function w, could have local extreme values. (If w is in class C’, the 
answer is no, since u would have to satisfy the differential equation given by Bose.) 
If u(p) = 0 is a local extreme of u, then the function uw would have property Z
and would be zero on an open set containing p. 


THEOREM. If F(x,y) = f(x)g(y) is continuous from $R^2$ to R, F has the property Z,
and F = 0 on a set U open in $R^2$, then F = 0 on $R^2$.


The proof of the theorem depends on the following lemma: 


LEMMA. Suppose T > 0, f and h are continuous in [0, 2T], h(t) > 0 for
0 < t < 2T, and $\int_0^t f(u)h(t-u)\,du = 0$ for $0 \le t \le 2T$. Then f(t) = 0 for $0 \le t \le T$.


The lemma is easily proved as a variation on the “Proof of Titchmarsh’s theorem
in the case f = g” on page 20 of [2].


Proof of the theorem: U contains a non-void open rectangle (a,b) × (c,d) on
which F = 0. If there is an $x_0$ in (a,b) with $f(x_0) \neq 0$, then

$$g(y) = \frac{F(x_0, y)}{f(x_0)} = 0$$

for all y in (c,d). Thus either f = 0 on (a, b) or g = 0 on (c,d). We assume f = 0
on (a, b), the other case being similar.

Let (a', b') be the largest open interval (possibly infinite) containing (a, b) on
which f = 0. We shall show that $b' = \infty$. If $b' < \infty$ we may, by a translation, suppose
b' = 0, so that a' < 0. If $f \equiv 0$ or $g \equiv 0$ on R, the theorem follows. Thus we assume
that $f \not\equiv 0$ and $g \not\equiv 0$. It follows that f and g are continuous. Let q be a number with
$g(q) \neq 0$. We assume g(q) > 0, the other case being similar. Let T > 0 such that
g(t) > 0 for q − 2T < t < q + 2T and 4T < −a'. Let

$$h(x) = \begin{cases} \int_{\alpha}^{\beta} g(y)\,dy, & 0 \le x \le 4T, \\ 0, & x > 4T, \end{cases}$$

where $\alpha = q - \sqrt{4T^2 - (x-2T)^2}$ and $\beta = q + \sqrt{4T^2 - (x-2T)^2}$. Then h is continuous
and, for 0 < x < 2T, the interval of integration in the definition of h(x) is
contained in [q − 2T, q + 2T], so h(x) > 0.

We suppose $0 < t \le 2T$ and show that $\int_0^t f(u)h(t-u)\,du = 0$. Since $2T < -a'$
we have $a' < t - 2T \le 0$. Thus f(t − 2T) = 0, so F(t − 2T, q) = 0. Now

$$D((t-2T, q), 2T) = \{(x,y) : t - 4T < x < t \text{ and } q - \sqrt{4T^2 - (t-x-2T)^2} < y < q + \sqrt{4T^2 - (t-x-2T)^2}\}.$$


Since f(x) = 0 for a' < x < 0 and since a' < t − 4T, we have

$$0 = \iint_{D((t-2T,q),2T)} F = \int_{t-4T}^{t} \left[ \int_{q-\sqrt{4T^2-(t-u-2T)^2}}^{q+\sqrt{4T^2-(t-u-2T)^2}} f(u)g(y)\,dy \right] du = \int_0^t f(u)h(t-u)\,du.$$

Hence, by the Lemma, f(t) = 0 for $0 \le t \le T$. That is, (a', T) is an open interval on which f = 0
which is larger than (a', b'), contrary to our choice of (a', b'). Thus $b' = \infty$. Similarly,
$a' = -\infty$, and f = 0 on R. Thus F = 0 on $R^2$ and the theorem is proved.


References 


1. A. K. Bose, Functions satisfying a weighted average property, Trans. Amer. Math. Soc., 118 
(1965) 472-487. 
2. Jan Mikusinski, Operational Calculus, Pergamon Press, New York, 1959. 


ON AN EXTENSION OF THE THEOREM OF HAUSDORFF-YOUNG 
LIANG-SHIN HAHN, University of New Mexico 


The Hausdorff-Young theorem [1, p. 98] asserts that if $1 \le p \le 2$ and q is the
conjugate exponent, that is, 1/p + 1/q = 1, then

$$\Big( \sum_{k \in Z} |\hat f(k)|^q \Big)^{1/q} \le \|f\|_p, \qquad f \in L^p(T).$$

If p = 1, the left-hand side is, by definition, $\sup_{k \in Z} |\hat f(k)|$. (For notations, see
Katznelson [1].)

Recalling that the Taylor coefficients of functions which are analytic in the 
(open) unit disc and continuous on the closed unit disc are precisely the Fourier 
coefficients of the boundary functions, the dramatic breakdown of the Hausdorff- 
Young theorem when p > 2, as shown here, does not seem to have been formulated 
before. 

Let HC(D) be the Banach space of all functions analytic in the (open) unit disc 
and continuous on the closed unit disc, endowed with the uniform norm. 




THEOREM. For each function $f(z) = \sum_{k=0}^{\infty} \hat f(k) z^k$ in HC(D), except perhaps
those of a meager subset of HC(D), it is the case that

$$\sum_{k=0}^{\infty} |\hat f(k)|^q = \infty \quad \text{for every } q < 2.$$


Proof. Let E and $E_n^q$ (n = 1, 2, 3, ...) be the sets of all functions
$f(z) = \sum_{k=0}^{\infty} \hat f(k) z^k$ in HC(D) such that $\sum_{k=0}^{\infty} |\hat f(k)|^q < \infty$ for some q < 2 and
$\sum_{k=0}^{\infty} |\hat f(k)|^q \le n$, respectively. Since $E = \bigcup_{j,n=1}^{\infty} E_n^{q_j}$, where $\{q_j\}_{j=1}^{\infty}$ is a monotone
increasing sequence of positive real numbers converging to 2, it is sufficient to show
that each $E_n^q$ is nowhere dense (q < 2). The proof is carried out in two stages:


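1. The set $E_n^q$ is closed. If $f_j \to f$ uniformly, then $\hat f_j(k) \to \hat f(k)$ for all k (uniformly)
as $j \to \infty$, hence for every positive integer M,

$$\sum_{k=0}^{M} |\hat f_j(k)|^q \to \sum_{k=0}^{M} |\hat f(k)|^q \qquad (j \to \infty).$$

Thus, if $f_j \in E_n^q$ (j = 1, 2, 3, ...), then $\sum_{k=0}^{\infty} |\hat f(k)|^q \le n$, and $f \in E_n^q$.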

2. Suppose $f \in HC(D)$. We want to show that given any $\varepsilon > 0$, there exists a
function g in HC(D), but not in $E_n^q$, such that $\|f - g\| < \varepsilon$. Since $E_n^q$ is closed, this
will establish that $E_n^q$ is nowhere dense. Choose a polynomial $g_1$ such that
$\|f - g_1\| < \varepsilon/2$. Existence of such a polynomial is guaranteed by the Fejér theorem
(and the maximum modulus principle).

Next, we claim that given any $\eta > 0$ and $0 < \xi < 1$, there exists a polynomial h
satisfying $\xi < |h(z)| < 1$ for all $|z| = 1$, and $\max_k |\hat h(k)| < \eta$.

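Once the existence of such a function is established, then since

$$\sum_{k=0}^{n} |\hat h(k)|^q > \eta^{q-2} \sum_{k=0}^{n} |\hat h(k)|^2 = \eta^{q-2}\, \frac{1}{2\pi} \int_0^{2\pi} |h(e^{it})|^2\,dt > \eta^{q-2} \xi^2 \qquad (n = \deg h),$$

(where in the second step, we have used the Parseval equality for the polynomial h),
we may let $g_2(z) = (\varepsilon/2) z^{2m} h(z)$, where $m = \deg g_1$; then $g = g_1 + g_2$ satisfies
the condition prescribed above. Clearly, $g \in HC(D)$,

$$\|f - g\| \le \|f - g_1\| + \|g_2\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$

and

$$\sum_{k=0}^{\infty} |\hat g(k)|^q = \sum_{k=0}^{m} |\hat g_1(k)|^q + \sum_{k=2m}^{2m+n} |\hat g_2(k)|^q = \sum_{k=0}^{m} |\hat g_1(k)|^q + \Big(\frac{\varepsilon}{2}\Big)^q \sum_{k=0}^{n} |\hat h(k)|^q > \sum_{k=0}^{m} |\hat g_1(k)|^q + \Big(\frac{\varepsilon}{2}\Big)^q \eta^{q-2} \xi^2.$$

Thus, noting that q − 2 < 0, we may, by choosing $\eta$ sufficiently small (and $\xi = 1/2$,
say), make the last term exceed n; hence $g \notin E_n^q$.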

It remains to show the existence of a polynomial h satisfying the required conditions.
Choose M and $\delta$, $0 < \delta < 1$, so that $(1/2)^M < \eta$, $\xi < \delta^M < 1$. By taking a
suitable partial sum $\phi_N(z)$ of the Taylor series expansion of


62) = Fe (lal = 5). 


one can get $\delta < |\phi_N(z)| < 1$ for $|z| = 1$. Consider
M-1 ; 
nz) = TT due) (A = 20). 
j= 


Then $\xi < \delta^M < |h(z)| < 1$, for $|z| = 1$.

The Taylor coefficients of h(z) are (either zero or) products of M Taylor coefficients
of $\phi_N(z)$, but since the maximum modulus of the Taylor coefficients of
$\phi(z)$ (hence of $\phi_N(z)$ also) is 1/2, we have $|\hat h(k)| \le (1/2)^M < \eta$ for all k. Thus h(z)
satisfies the required conditions and we are done.

The proof which has been presented also establishes the following: 


COROLLARY. In the Banach space C(T) of all continuous functions on the circle T
(with uniform norm), except perhaps those of a meager subset, each function has a
sequence of Fourier coefficients that belongs to $\ell^q(Z)$ for no value of q < 2.


Like many problems in analysis, this is a problem of comparison of norms. 
If a family of new “norms” are lower semi-continuous with respect to the original
Banach space norm, then the existence of one “counterexample” implies that the
collection of all “counterexamples” fills up the space to within a meager subset.
We leave it as an exercise for the reader to investigate the boundary cases of the 
M. Riesz conjugate function theorem [1, p. 68] and the Bernstein theorem on absolute 
convergence of Fourier series [1, p. 32]. 

In conclusion we mention an open problem: Give an example (or show the
existence) of a subset E of non-negative integers so that whenever $\sum_{k \in E} \hat f(k) z^k$
is the Taylor series of a function f in HC(D), then

$$\sum_{k \in E} |\hat f(k)|^p < \infty \quad \text{for all } p > 1,$$

but there is a Taylor series $\sum_{k \in E} \hat g(k) z^k$ of such a function g with $\sum_{k \in E} |\hat g(k)| = \infty$.


Acknowledgement. The author wishes to thank Professor B. Epstein for his help in preparation 
of the manuscript. 


Reference 


1. Y. Katznelson, An Introduction to Harmonic Analysis, Wiley, New York, 1968. 


A CHARACTERIZATION OF THE n × n MATRICES OVER A FINITE FIELD
J. V. BRAWLEY, Clemson University, and L. CARLITZ, Duke University


If R is a commutative ring with identity, then it is known [4, p. 507] that every 
function f: R > R is representable as a polynomial if and only if R is a finite field. 
The theorems below are a generalization of that result to arbitrary rings with or 
without an identity. 

Consider an expression in an indeterminate x of one of the following types: 


$$f(x) = \begin{cases} a_1 x^{n_1} a_2 x^{n_2} \cdots a_k x^{n_k} a_{k+1}, \\ a_1 x^{n_1} a_2 x^{n_2} \cdots a_k x^{n_k}, \\ x^{n_1} a_1 x^{n_2} a_2 \cdots x^{n_k} a_k, \\ x^{n_1} a_1 x^{n_2} a_2 \cdots a_k x^{n_{k+1}}, \\ x^{n_1}, \end{cases} \tag{1}$$

where the elements $a_i$ are in R, the integers $n_i$ are positive, and $k \ge 1$ is finite but
arbitrary. In each of these expressions it is meaningful to speak of the function
$f : R \to R$ defined from f(x) by substitution. Likewise if p(x) is a form of the type

$$p(x) = a_0 + f_1(x) + \cdots + f_t(x), \tag{2}$$

where each $f_i(x)$ is of the type (1), $a_0 \in R$, and $t \ge 0$, it is clear that substitution
in (2) defines a function p from R to R. An expression of the form (2) will be called
a polynomial over R (this definition of polynomial is more general than that which
is often considered, e.g. [3, p. 99]), and P(R) will denote all those functions in
$R^R$ which can be represented by polynomials (upon substitution). We seek necessary
and sufficient conditions on R in order that every function from R to R be representable
as a polynomial; that is, that $R^R = P(R)$.


LEMMA 1. If $R^R = P(R)$, then R is finite.


Proof. Suppose, to the contrary, that card R = |R| = ∞. Then the number of
expressions of the form (2) has cardinality |R| so that $|R| \ge |P(R)| = |R^R| = |R|^{|R|}$.
This is a contradiction as $|R|^{|R|} > |R|$.


LEMMA 2. If $R^R = P(R)$, then the only ideals of R are (0) and R.


Proof. Suppose there exists an ideal I in R with $(0) \subsetneq I \subsetneq R$. Let $\mu : R \to R/I$
be the natural homomorphism, $\mu(r) = \bar r = r + I$, and select $a \in I$, $a \neq 0$ and
$b \notin I$. Since $R^R = P(R)$ there exists a polynomial of the form (2) representing the
function

$$p(r) = \begin{cases} b, & r = a, \\ 0, & r \neq a. \end{cases}$$




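Let $\bar p(x)$ be the polynomial over R/I obtained from (2) by replacing each $a_i \in R$
by $\bar a_i$. Clearly $\bar p(\bar r) = \overline{p(r)}$ as $\mu$ is a homomorphism. Now $\bar p(\bar 0) = \bar p(\bar a)$ as $\bar a = \bar 0$.
But $\bar p(\bar 0) = \overline{p(0)} = \bar 0$ and $\bar p(\bar a) = \overline{p(a)} = \bar b \neq \bar 0$, which is a contradiction.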

We can now prove 


THEOREM 1. Let R be a ring. Then $R^R = P(R)$ if and only if R is either the
trivial ring of order 1 or 2 ($ab = 0$ for all $a, b \in R$) or for some n and some finite field F,
$R = (F)_n$, the n × n matrices over F.


Proof. Suppose that $R^R = P(R)$. From the above lemmas, R is finite and has
only trivial ideals. Assume first that R is a trivial ring. Then (R, +) has no proper
subgroups as every subgroup is an ideal. This means that R has order 1 or p, p
prime [3, p. 42, Ex. 3.29]. Because R is a trivial ring it is clear that polynomials of
the form $p(x) = a_0 + a_1 x$, where $a_0, a_1 \in \{0, 1, \ldots, p-1\}$, must represent all the functions
in $R^R$. If |R| = p, the number of such polynomials is $p^2$ so that $p^2 = p^p$
or p = 2. If R is not the trivial ring then since R is a finite simple ring we have by
the Wedderburn-Artin theorem [see 2, p. 48] and the Wedderburn theorem on
finite division rings [see 2, p. 70] that $R = (F)_n$, for some finite field F and integer n.

Conversely, if R is a trivial ring of order 1 or 2, it is easy to see that every function
in $R^R$ is a polynomial; hence, assume $R = (F)_n$ for some $n \ge 1$ and finite field F.
To complete the proof we shall use the following lemma: 


LEMMA 3. Let R be a finite ring with identity. Then $R^R = P(R)$ iff the function

$$p(r) = \begin{cases} 1, & r = 0, \\ 0, & r \neq 0, \end{cases} \tag{3}$$

is representable by a polynomial p(x).


Proof of Lemma 3. If $R^R = P(R)$, then of course, p of (3) is representable by
a polynomial. On the other hand, if p(x) represents p and if f is an arbitrary function
from R to R, then

$$f(x) = \sum_{r \in R} f(r)\, p(x - r)$$

is a polynomial representation of f.
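
The interpolation formula in Lemma 3 can be tried out in the simplest commutative case. The sketch below is only an illustration under assumptions that are not in the text: R is taken to be the prime field Z/pZ, the indicator function (3) is represented by 1 − x^(p−1) (which is 1 at 0 and 0 elsewhere by Fermat's little theorem), and the helper names `delta` and `interpolate` are hypothetical.

```python
def delta(x, p):
    """Value of 1 - x^(p-1) mod p: equals 1 at x = 0 and 0 at every other residue."""
    return (1 - pow(x, p - 1, p)) % p


def interpolate(f_values, p):
    """Return x -> sum over r of f(r) * delta(x - r) mod p, as in the proof of Lemma 3.

    f_values[r] is the prescribed value f(r) for r = 0, ..., p - 1.
    """
    def g(x):
        return sum(f_values[r] * delta((x - r) % p, p) for r in range(p)) % p
    return g


# An arbitrary function on Z/7Z, given by its table of values (illustrative data).
p = 7
table = [3, 1, 4, 1, 5, 2, 6]
g = interpolate(table, p)
print([g(x) for x in range(p)])
```

Each term f(r)·delta(x − r) vanishes unless x = r, so the sum reproduces the table of values; the matrix case treated next needs Dickson's polynomial in place of 1 − x^(p−1).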
Returning to the proof of the theorem we need only to show that if $R = (F)_n$
then the function (3) is representable by a polynomial of the type

$$F(X) = A_0 + \sum_{i=1}^{t} F_i(X),$$

where the polynomials $F_i(X)$ are of the type $A_1 X A_2 \cdots A_k X A_{k+1}$ ($k \ge 1$), with
$A_i \in (F)_n$ and with $X = (x_{ij})$ denoting an n × n matrix in indeterminates $x_{ij}$. Note
since $(F)_n$ has an identity (denoted by I) only one of the forms of (1) is necessary.

By a result of Dickson [1, p. 124], there exists a polynomial f(x) in the $n^2$ variables
$x = (x_{11}, \ldots, x_{1n}, x_{21}, \ldots, x_{2n}, \ldots, x_{n1}, \ldots, x_{nn})$ which, upon substitution of values
from F, takes on the values 0 for x = 0 and 1 for x ≠ 0. Denoting by $E_{ij}$ the
n × n elementary matrix over F with a 1 in position (i,j) and zeros elsewhere, it is
easily seen that $E_{1i} X E_{j1} = x_{ij} E_{11}$. Setting $X^{(ij)} = E_{1i} X E_{j1}$ and $\hat X = (X^{(11)}, X^{(12)}, \ldots, X^{(nn)})$,
we find that $f(\hat X) = \tilde f(X)$ is a polynomial in X and $\tilde f(X) = f(x) E_{11}$. Thus

$$p(X) = I - \sum_{i=1}^{n} E_{i1}\, \tilde f(X)\, E_{1i} = I - \mathrm{diag}(f(x), f(x), \ldots, f(x))$$

is a polynomial in X taking on the value I for X = 0 and 0 for all X ≠ 0. Thus
by Lemma 3 the proof is complete.
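
The two matrix-unit identities used in this step are easy to check numerically. The following Python sketch is an illustration only: it assumes F = Z/pZ with matrices stored as lists of rows, indices run from 0 (so the text's E with subscripts 1, i corresponds to `unit(0, i, n)`), and the helper names are hypothetical.

```python
def unit(i, j, n):
    """Elementary n x n matrix with a 1 in position (i, j) and zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def matmul(A, B, p):
    """Product of two n x n matrices with entries reduced mod p."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) % p for c in range(n)]
            for r in range(n)]


def extract_entry(X, i, j, p):
    """Compute E_{1i} X E_{j1}; the result carries X[i][j] in the top-left corner."""
    n = len(X)
    return matmul(matmul(unit(0, i, n), X, p), unit(j, 0, n), p)


def spread_to_diagonal(c, n, p):
    """Sum over i of E_{i1} (c E_{11}) E_{1i}, which places c in every diagonal slot."""
    total = [[0] * n for _ in range(n)]
    corner = [[c % p if (r, s) == (0, 0) else 0 for s in range(n)] for r in range(n)]
    for i in range(n):
        term = matmul(matmul(unit(i, 0, n), corner, p), unit(0, i, n), p)
        total = [[(total[r][s] + term[r][s]) % p for s in range(n)] for r in range(n)]
    return total


X = [[1, 2, 3], [4, 5, 6], [0, 1, 2]]   # sample matrix over Z/7Z
print(extract_entry(X, 1, 2, 7))
print(spread_to_diagonal(5, 3, 7))
```

The first identity isolates a single entry of X as a two-sided multiple of X, which is what lets a polynomial in the entries be rewritten as a polynomial in X; the second moves the scalar from the corner onto the whole diagonal.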

Since $(F)_n$ is commutative iff n = 1 we have the following corollary which includes
the result mentioned in the first paragraph.


COROLLARY. Let R be a commutative ring. Then $R^R = P(R)$ iff either R is a
trivial ring of order 1 or 2, or R is a finite field.


By restricting the notion of polynomials we obtain as a final result: 


THEOREM 2. Let R be a ring with 1 and let L[x] and R[x] denote respectively
the left and right polynomials over R; that is, $L[x] = \{a_0 + x a_1 + \cdots + x^n a_n : a_i \in R\}$,
$R[x] = \{a_0 + a_1 x + \cdots + a_n x^n : a_i \in R\}$. Let $P_L(R)$ and $P_R(R)$ denote those functions
in $R^R$ representable by left and right polynomials, respectively. Then
$P_L(R) = R^R$ iff $P_R(R) = R^R$ iff R is a finite field.


Proof. Suppose $P_L(R) = R^R$. Then R is finite by an argument similar to that
of Lemma 1. Let $b \in R$ be an arbitrary nonzero element and let $p(x) = a_0 + x a_1 + \cdots + x^n a_n$
be the left polynomial representing the function $p : R \to R$ defined by p(b) = 1,
p(r) = 0 for r ≠ b. Since p(0) = 0 it follows that $a_0 = 0$ and since
$p(b) = 1 = b(a_1 + \cdots + b^{n-1} a_n)$ it follows that b has a right inverse, i.e., R − {0} is a group.
Thus R is a finite division ring which by Wedderburn’s theorem is a field. A similar
proof holds if $P_R(R) = R^R$.

If R is a finite field it is well known that $R^R = P_L(R) = P_R(R)$ and the proof
is complete.


L. Carlitz’s work was supported in part by NSF grant GP-17031. 


References 


1. L. E. Dickson, General theory of modular invariants, Trans. Amer. Math. Soc., 10 (1909)
123-158. 

2. I. N. Herstein, Noncommutative Rings, Carus Mathematical Monograph, No. 15, M.A.A. 
(1968). 

3. N. Jacobson, Lectures in Abstract Algebra, Vol. 1, Van Nostrand, Princeton, N.J., 1953. 

4. L. Rédei, Algebra, Vol. 1, Pergamon Press, Oxford, 1967. 

5. J.J. Rotman, The Theory of Groups: An Introduction, Allyn and Bacon, Boston, 1965. 


ANOTHER PROOF OF BERNSTEIN’S THEOREM 
P. J. O’Hara, Florida Technological University 


If P is a complex polynomial put

$$\|P\|_R = \max_{|z|=R} |P(z)|.$$

A well-known theorem states that if P is of degree n then

$$\|P'\|_R \le \frac{n}{R}\, \|P\|_R.$$


In the literature (see [1] or [2]) this result is usually derived from the equivalent 
theorem by S. N. Bernstein [3] for trigonometric polynomials on the real line. The 
proof for the trigonometric case usually involves some analysis such as Rolle’s 
theorem. In this note an apparently new proof is given which in essence involves no 
analysis. The proof depends on the following lemma which is of interest in itself. 


LEMMA. If P is any complex polynomial of degree $\le n$ and $z_1, \ldots, z_n$ are the
zeros of $z^n + 1$, then for every complex number t,

$$tP'(t) = \frac{n}{2} P(t) + \frac{1}{n} \sum_{k=1}^{n} P(t z_k)\, \frac{2 z_k}{(z_k - 1)^2}.$$

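Proof. For each complex number t define the function $Q_t$ by:

$$Q_t(z) = \frac{P(tz) - P(t)}{z - 1}.$$

$Q_t$ is a polynomial of degree $\le n - 1$, and therefore by using the Lagrange interpolation
formula [4] with $z_1, \ldots, z_n$ as interpolation nodes we can write:

$$Q_t(z) = \sum_{k=1}^{n} Q_t(z_k)\, \frac{z^n + 1}{n z_k^{n-1} (z - z_k)} = \frac{1}{n} \sum_{k=1}^{n} Q_t(z_k)\, \frac{z_k (z^n + 1)}{z_k - z}.$$

In obtaining this last equation we have used the fact that $z_k^{n-1} = -1/z_k$. Now
since $Q_t(1) = tP'(t)$, we obtain the following identity in t:

$$tP'(t) = \frac{1}{n} \sum_{k=1}^{n} Q_t(z_k)\, \frac{2 z_k}{z_k - 1} = \frac{1}{n} \sum_{k=1}^{n} [P(t z_k) - P(t)]\, \frac{2 z_k}{(z_k - 1)^2}$$

$$\phantom{tP'(t)} = \frac{1}{n} \sum_{k=1}^{n} P(t z_k)\, \frac{2 z_k}{(z_k - 1)^2} - \frac{P(t)}{n} \sum_{k=1}^{n} \frac{2 z_k}{(z_k - 1)^2}. \tag{1}$$

Applying (1) with $P(t) = t^n$ establishes that

$$\frac{1}{n} \sum_{k=1}^{n} \frac{2 z_k}{(z_k - 1)^2} = -\frac{n}{2}. \tag{2}$$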



Clearly (1) and (2) establish the lemma. 
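
As a sanity check, the identity of the lemma, tP'(t) = (n/2)P(t) + (1/n) Σ P(t z_k)·2z_k/(z_k − 1)², can be tested numerically. The Python sketch below is only an illustration: the sample polynomial, the point t, and the helper names are arbitrary choices, not taken from the note.

```python
import cmath


def poly(coeffs, z):
    """Evaluate sum(coeffs[k] * z**k) by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def poly_deriv(coeffs, z):
    """Evaluate the derivative of the same polynomial at z."""
    return poly([k * c for k, c in enumerate(coeffs)][1:], z)


def lemma_rhs(coeffs, n, t):
    """(n/2) P(t) + (1/n) * sum of P(t z_k) * 2 z_k / (z_k - 1)^2 over the zeros of z^n + 1."""
    roots = [cmath.exp(1j * cmath.pi * (2 * k + 1) / n) for k in range(n)]
    total = sum(poly(coeffs, t * z) * 2 * z / (z - 1) ** 2 for z in roots)
    return n / 2 * poly(coeffs, t) + total / n


coeffs = [1, -2, 0.5 + 1j, 3]   # an arbitrary cubic, so any n >= 3 is admissible
t = 0.7 - 0.4j
for n in (3, 5):
    lhs = t * poly_deriv(coeffs, t)
    # The difference should be at rounding-error level.
    print(n, abs(lhs - lemma_rhs(coeffs, n, t)))
```

Taking P constant makes the left side zero, so the same function also checks identity (2).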


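In order to prove Bernstein’s theorem we first apply the lemma to obtain that
if |t| = R then

$$R\,|P'(t)| \le \frac{n}{2} \|P\|_R + \frac{1}{n} \sum_{k=1}^{n} \left| \frac{2 z_k}{(z_k - 1)^2} \right| \|P\|_R. \tag{3}$$

Now if |z| = 1 and z ≠ 1 then $2z/(z - 1)^2$ is a negative real number. In fact it is
easy to show that

$$\frac{2 e^{i\theta}}{(e^{i\theta} - 1)^2} = -\frac{1}{2 \sin^2(\theta/2)}.$$
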

Bernstein’s theorem follows easily now from (2) and (3). 

The lemma and other similarly derived identities can be used to obtain many 
other interesting results. For example the following theorem for which the author 
has no reference is an immediate consequence of the above discussion. 


THEOREM. If P is a complex polynomial of degree n, $P(t_0) = 0$, and $|t_0| = R$ then

$$|P'(t_0)| \le \frac{n}{2R}\, \|P\|_R.$$


References 


1. I. P. Natanson, Constructive Function Theory, vol. 1, Ungar, New York, 1964, p. 90 (Trans- 
lated from the Russian.) 


2. G. G. Lorentz, Approximation of Functions, Holt, Rinehart, and Winston, New York, 1966, 
p. 39. 


3. S. N. Bernstein, Sur l’ordre de la meilleure approximation des fonctions continues par des
polynômes de degré donné, Mémoires de l’Académie Royale de Belgique, 2 (1912), vol. 4, pp. 1-104.
4. F. B. Hildebrand, Introduction to Numerical Analysis, McGraw-Hill, New York, 1956, p. 62. 


ADDENDUM TO “ON THE DIFFEOMORPHISMS OF EUCLIDEAN SPACE”
W. B. GORDON, Naval Research Laboratory, Washington


A recent note [1] in this MONTHLY was concerned with the following theorem, 
which is a global version of the standard Inverse Function Theorem. 


THEOREM. Let $M_1$ and $M_2$ be connected, oriented manifolds of class $C^1$, without
boundary, and suppose that $M_2$ is simply connected. Then a $C^1$ map f from
$M_1$ to $M_2$ is a diffeomorphism if and only if f is proper and the Jacobian of f never
vanishes.


In particular, a $C^1$ map f from $R^n$ to $R^n$ is one-one and onto if

(i) $|x| \to \infty$ implies $|f(x)| \to \infty$, and
(ii) the Jacobian of f never vanishes.




In this note I observed that although this theorem was known to Hadamard
(1905), it does not appear to be “well-known” at the present time, and soon after
the note went to press I learned that the theorem was rediscovered by R. S. Palais
in the course of developing propositions of a more general character. See [2, pp.
128-129].

Professor Palais also suggests the following simplifications to the proof: In my
note I appealed to the standard results of degree theory to establish that f is onto;
but this is not necessary, for one can easily show that a proper map between manifolds
sends closed sets into closed sets. (For a more general statement of this fact
see [3].) But the non-vanishing of the Jacobian implies that f is a local homeomorphism
and therefore sends open sets into open sets. Hence $f(M_1)$ is both an open
and closed (nonempty) subset of $M_2$; i.e., $f(M_1) = M_2$.


References 


1. W. B. Gordon, On the diffeomorphisms of euclidean space, this MONTHLY, 79 (1972) 755-759. 
2. R. S. Palais, Natural operations on differential forms, Trans. Amer. Math. Soc., 92 (1959) 
125-141. 


3. R. S. Palais, When proper maps are closed, Proc. Amer. Math. Soc., 24 (1970) 835-836.


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and 
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


HOW UNEXPECTED IS THE PRIME NUMBER THEOREM? 
M. D. HIRSCHHORN, University of New South Wales, Australia


Let $p_1, \ldots, p_n$ be the first n primes. Let x be chosen randomly from among the integers
greater than $p_n$. The probability that x is divisible by $p_i$ is $1/p_i$, so the probability
that x is divisible by none of $p_1, \ldots, p_n$ is

$$\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_n}\right).$$


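If we make the unjustified assumption that the property of being divisible by
none of $p_1, \ldots, p_n$ is held randomly by numbers greater than $p_n$ with probability
$(1 - 1/p_1) \cdots (1 - 1/p_n)$, then the probability that $p_{n+1} - p_n = r$ is

$$\left[1 - \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_n}\right)\right]^{r-1} \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_n}\right).$$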


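So the expected value of $p_{n+1} - p_n$ is

$$\sum_{r=1}^{\infty} r \left[1 - \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_n}\right)\right]^{r-1} \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_n}\right) = 1 \Big/ \left[\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_n}\right)\right].$$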


Accordingly, we define the series of problimes by 


$$q_1 = 2, \qquad q_{n+1} = q_n + 1 \Big/ \left[\left(1 - \frac{1}{q_1}\right) \cdots \left(1 - \frac{1}{q_n}\right)\right].$$


The interesting question that arises is, what is the asymptotic behavior of the $q_n$?
In particular, is it true that

$$q_n \sim n \log_e n\,?$$

If this is true, then perhaps the prime number theorem is a little less surprising,
since it is a consequence of the prime number theorem that

$$p_n \sim n \log_e n.$$


I have been able to prove that
(1) $q_n/n$ is unbounded,
(2) $q_n/n^{1+\varepsilon} \to 0$ as $n \to \infty$ for any fixed $\varepsilon > 0$,
but I have not been able to prove that $q_n/n$ is eventually monotonic increasing.
It is clear that the problimes soon become non-integral. It may be more attractive
to study the various integral sequences defined by

$$q_1 = 2, \qquad q_{n+1} = q_n + F\left(1 \Big/ \left[\left(1 - \frac{1}{q_1}\right) \cdots \left(1 - \frac{1}{q_n}\right)\right]\right),$$

where

$F(x) = [x]$, the greatest integer not greater than x,
$F(x) = \langle x \rangle$, the closest integer to x, or
$F(x) = \{x\}$, the least integer not less than x.


1973] CLASSROOM NOTES 677 


The first few terms of each of these sequences are given in the following table, to- 
gether with the primes for comparison: 


n               1     2     3     4     5     6     7     8     9     10    11
p_n             2     3     5     7     11    13    17    19    23    29    31
q_n *           2.0   4.0   6.7   9.8   13.3  17.1  21.1  25.3  29.7  34.2  38.9
q_n, F = [x]    2     4     6     9     12    15    19    23    27    31    35
q_n, F = <x>    2     4     7     10    13    17    21    25    29    34    39
q_n, F = {x}    2     4     7     11    15    19    23    28    33    38    43

n               12    13    14    15    16    17    18    19    20    21
p_n             37    41    43    47    53    59    61    67    71    73
q_n *           43.7  48.6  53.6  58.7  63.9  69.2  74.6  80.0  85.5  91.1
q_n, F = [x]    40    45    50    55    60    65    70    75    80    86
q_n, F = <x>    44    49    54    59    64    69    74    79    84    90
q_n, F = {x}    48    53    58    63    68    73    79    85    91    97


* to 1 decimal place
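
The recurrences defining the problimes and their integral variants are easy to experiment with. The following is a minimal sketch, not part of the original note: the function names are hypothetical, the non-integral sequence is computed in floating point, and the "closest integer" rule is assumed to round halves upward.

```python
from fractions import Fraction
import math


def real_problimes(count):
    """First `count` terms of q_1 = 2, q_{n+1} = q_n + 1/prod(1 - 1/q_i), as floats."""
    q, product, terms = 2.0, 1.0, [2.0]
    for _ in range(count - 1):
        product *= 1.0 - 1.0 / q      # running product (1 - 1/q_1)...(1 - 1/q_n)
        q += 1.0 / product
        terms.append(q)
    return terms


def nearest(x):
    """Closest integer to x (halves round upward; a tie-breaking assumption)."""
    return math.floor(x + Fraction(1, 2))


def integral_problimes(count, F):
    """Integral variant q_{n+1} = q_n + F(1/prod(1 - 1/q_i)), in exact arithmetic."""
    q, product, terms = 2, Fraction(1), [2]
    for _ in range(count - 1):
        product *= Fraction(q - 1, q)  # multiply in the factor (1 - 1/q_n)
        q += F(1 / product)
        terms.append(q)
    return terms


if __name__ == "__main__":
    print([round(t, 1) for t in real_problimes(7)])
    for F in (math.floor, nearest, math.ceil):
        print(integral_problimes(11, F))
```

Passing `math.floor`, `nearest`, or `math.ceil` as `F` selects the three roundings described in the text; exact rational arithmetic is used there so that the rounding step is never affected by floating-point error.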


P. Erdős (written communication) believes that he can prove by Tauberian arguments that
$(q_{n+1} - q_n)/\log_e n \to 1$, and hence that $q_n/(n \log_e n) \to 1$.


I am indebted to the referee and to Richard K. Guy for their helpful comments and suggestions. 


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


THE INDECOMPOSABILITY OF THE DYADIC SOLENOID 
S. B. NADLER, JR., University of North Carolina at Charlotte


In a beginning topology course students are sometimes confronted with the 
notion of ‘‘indecomposability’’ for continua (a continuum is indecomposable [4, 
p. 139] provided it cannot be written as the union of two proper subcontinua). 
There are proofs in the literature (see, for example, [3], [5], and material related
to [5]) that certain continua are indecomposable, but these proofs are ‘‘compli-
cated’’ and, for the most part, inaccessible to a beginning student. Intuitive argu- 
ments for the indecomposability of certain continua, based on thinking of them 
as special nested intersections, are often dissatisfying to students because some of 
the details of such arguments are difficult to write down. 

The purpose of this note is to give (in detail) a short and direct proof of the 
well-known and often-quoted result (see, for example, [4, p. 145]) that the dyadic 
solenoid is indecomposable. The proof is the simplest complete proof of which the 
author is aware, showing the indecomposability of any specific continuum. It uses 
only the very basic properties of inverse limits, a construction which appears in 
many standard texts ([2], [4], and others). These properties all appear in [1], where
it is shown that they are a consequence of translating, to the framework of inverse 
limits, such standard results as (for example) the intersection of a decreasing sequence 
of metric continua is a metric continuum, etc. The crux of the proof is a pigeonhole- 
type argument. 

Let X denote the inverse limit of the inverse sequence $\{C_i, f_i\}_{i=1}^{\infty}$ where $C_i$ is the
unit circle in the plane and $f_i : C_{i+1} \to C_i$ is given by $f_i(z) = z^2$ for each $i = 1, 2, \ldots$
(X is commonly called the dyadic solenoid). In what follows we let $\pi_i : X \to C_i$
be the projection mapping (restricted to X).


THEOREM. X is indecomposable. 


Proof. Let A and B be subcontinua of X such that $A \cup B = X$. It suffices to
show that $A \subset B$ or $B \subset A$. Let i be a given natural number and suppose $\pi_i(A) \not\subset \pi_i(B)$
and $\pi_i(B) \not\subset \pi_i(A)$. Choose $a \in \pi_i(A) - \pi_i(B)$ and $b \in \pi_i(B) - \pi_i(A)$, let x and y be
the two square roots of a, and let z and w be the two square roots of b. Since the
bonding maps are all onto, the projection $\pi_{i+1}$ maps X onto $C_{i+1}$ (2.6 of [1]).
Thus, since $A \cup B = X$,

$$x \in \pi_{i+1}(A) \cup \pi_{i+1}(B).$$

If $x \in \pi_{i+1}(B)$, then

$$a = f_i(x) \in f_i(\pi_{i+1}(B)) = \pi_i(B)$$

(2.1 of [1]), a contradiction. Therefore, $x \in \pi_{i+1}(A) - \pi_{i+1}(B)$ and the same is true
of y. Similarly, z and w each belong to $\pi_{i+1}(B) - \pi_{i+1}(A)$. Since A is connected
and x and y are diametrical members of $\pi_{i+1}(A)$, $\pi_{i+1}(A)$ must contain at least
one of the two semicircles determined by x and y. But then z and w being diametrical
implies z or w is in this semicircle, hence in $\pi_{i+1}(A)$, a contradiction. This proves

(1) $\pi_i(A) \subset \pi_i(B)$ or (2) $\pi_i(B) \subset \pi_i(A)$.


Thus, (1) holds for infinitely many natural numbers or (2) holds for infinitely many
natural numbers. This implies, by 2.1 of [1], that (1) holds for all natural numbers
or (2) holds for all natural numbers. Now, since A is equal to the inverse limit of

$$\{\pi_i(A),\ f_i \mid \pi_{i+1}(A)\}_{i=1}^{\infty}$$

and B is equal to the inverse limit of

$$\{\pi_i(B),\ f_i \mid \pi_{i+1}(B)\}_{i=1}^{\infty}$$

(see 2.8 of [1]), it follows that $A \subset B$ or $B \subset A$.
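
The pigeonhole step of the proof rests on a fact about the circle: if x, y are the square roots of a and z, w are the square roots of a different point b, then each closed semicircle determined by x and y contains z or w. The sketch below checks this numerically for sample points; the function names are hypothetical and the check is an illustration, not a substitute for the argument.

```python
import cmath
import math


def square_roots(c):
    """The two (diametrically opposite) square roots of a unit complex number."""
    root = cmath.sqrt(c)
    return root, -root


def on_closed_arc(point, start, end):
    """True if `point` lies on the closed counterclockwise arc from `start` to `end`."""
    two_pi = 2 * math.pi
    span = (cmath.phase(end) - cmath.phase(start)) % two_pi
    offset = (cmath.phase(point) - cmath.phase(start)) % two_pi
    return offset <= span


def each_semicircle_meets_roots_of_b(a, b):
    """Check that both semicircles cut off by the roots of a contain a root of b."""
    x, y = square_roots(a)
    z, w = square_roots(b)
    first = on_closed_arc(z, x, y) or on_closed_arc(w, x, y)
    second = on_closed_arc(z, y, x) or on_closed_arc(w, y, x)
    return first and second


if __name__ == "__main__":
    a, b = cmath.exp(1.0j), cmath.exp(2.5j)
    print(each_semicircle_meets_roots_of_b(a, b))
```

The comparison is done on arguments of complex numbers, so the sample points should not be chosen so close together that rounding error decides which arc a root falls on.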


References 


1. C. E. Capel, Inverse limit spaces, Duke Math. J., 21(1954) 233-245. 

2. James Dugundji, Topology, Allyn and Bacon, Boston, Mass., 1966.

3. A. van Heemert, Topologische Gruppen und unzerlegbare Kontinua, Compositio Math., 
5(1937) 319-326. 

4. John G. Hocking and Gail S. Young, Topology, Addison-Wesley, Reading, Mass., 1961. 

5. W. T. Ingram, Concerning nonplanar circle-like continua, Canadian J. Math., 19 (1967) 
242-250. 


THE DIFFERENTIABILITY PROPERTIES OF TYPICAL FUNCTIONS IN C[a, b]
A. M. BRUCKNER, University of California, Santa Barbara 


1. Introduction. Most students who complete an undergraduate program in 
mathematics encounter in some course an example of a continuous nowhere dif- 
ferentiable function. They may or may not be surprised to learn of the existence of 
such functions, but they are generally surprised that such functions represent the rule, 
rather than the exception. In saying that continuous nowhere differentiable functions 
represent the rule we mean that if C[a,b] represents, as usual, the set of continuous
real valued functions defined on the closed interval [a, b] furnished with the metric 
of uniform convergence, then the subset of nowhere differentiable functions is residual 
in the complete metric space C[a, b]. This means that the complement of this subset 
is a set of the first category and is therefore negligible in the sense of category. In 
later courses, the student comes across other types of ‘‘pathological’’ continuous 
functions and learns that they, too, are typical, rather than exceptional. In any case, 
the student might complete his mathematical education with a feeling that typical 
continuous functions behave very irregularly with respect to differentiation. And he 
may wonder — ‘‘what is the typical continuous function like when it comes to 
differentiation properties?’’ The purpose of this note is to provide an answer to this 
question. We shall see that the typical continuous function is indeed very pathological. 
But, by taking a different perspective, we shall see that one can also come to the 
opposite conclusion. 

When we say a ‘‘typical’’ continuous function has a certain property, we shall
mean that the set of continuous functions with this property is residual in C[a, b].
We mention that none of the material we present is new. It can all be found in the 
mathematical literature, but only parts of what we present have found their way 
into the textbooks. 


2. Nowhere differentiable functions. The first example of a continuous nowhere 
differentiable function is widely assumed to be due to Weierstrass (about 1875), 
although there appears to be some evidence suggesting that Bolzano had con- 
structed such a function considerably earlier. More recently, a number of authors 
have constructed simpler examples of such functions. See for example Mikolas [8] 
for a general method of constructing continuous nowhere differentiable functions. 
It wasn’t until 1931 that the existence of such functions was proved by the use of 
the Baire Category Theorem (see Banach [1] and Mazurkiewicz [7]). Now to say 
that a function is nowhere differentiable is to say that at no point does it have a 
finite (two-sided) derivative. What happens if we allow derivatives to be infinite? 
And what happens if we allow derivatives to be only one-sided? An inspection of 
the standard category argument shows that one can, by perhaps modifying the 
arguments a bit, allow either of these relaxations in the definition of the derivative 
and still conclude that a typical continuous function is nowhere differentiable. 
But one can’t allow both relaxations simultaneously without losing the result! 
Thus, we can say that a typical continuous function has at no point a finite or infinite
two-sided derivative, nor a finite one-sided derivative, but does have an infinite 
one-sided derivative at each point of some nonempty set S. Actually, it was shown 
by Saks [10] that this set S is nondenumerable. So we can’t use a category argument 
to determine whether or not there exists a continuous function which fails at every 
point to possess a one-sided finite or infinite derivative. The question of the existence 
of such functions remained unanswered until Besicovitch constructed such a function 
in 1925 [2]. A construction can be found in Jeffery [5; p. 172]. The construction is 
geometrical, but complicated. An arithmetical construction can be found in Morse [9]. 
Such functions are not typical, however, as Saks’ result shows. 

A quick word about the category proofs. The standard proofs are relatively
straightforward. One shows directly that the class of continuous functions which
possess at even one point a derivative can be written as a countable union of nowhere
dense sets. For example, if $E_n$ denotes the set of functions f in C[0,1] such that
for some x in $[0, 1 - 1/n]$ the inequality $|f(x + h) - f(x)| \le nh$ holds for all h
satisfying $0 < h < 1 - x$, then it is not difficult to verify that each $E_n$ is nowhere
dense by verifying that it is closed and has a dense complement. It is also easy to see
that if f has a finite right derivative at some point, then $f \in \bigcup_{n=1}^{\infty} E_n$. Thus the class
of functions having finite right derivatives in a nonempty set is of the first category
in C[a,b]. On the other hand, it is harder to show that the class of functions having
infinite right-hand derivatives on a nonempty set is residual in C[a,b].
We know of no straightforward proof of this fact. What Saks [10] showed is that
this set is of the second category in every sphere of C[a,b]. Now, it is possible for
a set to be of the second category in every sphere without this set being residual.
But this is not possible if the set has the property of Baire. Thus, by showing that
the set of functions with a finite or infinite derivative on a nondenumerable set is
an analytic set in C[a,b], and therefore has the property of Baire, Saks was able to
complete the proof that this set is residual in C[a,b].
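
The claim that each $E_n$ has a dense complement can be made concrete: adding a steep sawtooth of small amplitude to a function moves it out of $E_n$ while changing it little in the uniform metric. The sketch below is only a finite-grid approximation of the definition of $E_n$ on [0,1]; the function names, the grid size, and the particular sawtooth are assumptions chosen for illustration.

```python
def in_E_n_on_grid(f, n, steps=1000):
    """Grid version of membership in E_n: is there a grid point x in [0, 1 - 1/n]
    with |f(x+h) - f(x)| <= n*h for every grid step h with 0 < h < 1 - x?"""
    grid = [i / steps for i in range(steps + 1)]
    values = [f(x) for x in grid]
    for i, x in enumerate(grid):
        if x > 1 - 1 / n:
            break
        if all(abs(values[j] - values[i]) <= n * (grid[j] - x) + 1e-12
               for j in range(i + 1, steps)):
            return True
    return False


def sawtooth(x, slope=50.0, half_period=0.001):
    """Triangle wave with the given slope; its amplitude is slope * half_period."""
    period = 2 * half_period
    return slope * abs(x - period * round(x / period))


if __name__ == "__main__":
    print(in_E_n_on_grid(lambda x: x, 2))                # the identity passes the test
    print(in_E_n_on_grid(lambda x: x + sawtooth(x), 2))  # the perturbed function does not
```

Here the perturbation has uniform norm 0.05 but slope 50; shrinking the half-period (and refining the grid to match) makes the perturbation as small as desired while keeping it steep, which is the idea behind the density of the complement.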


3. The functions of Marcinkiewicz and Jarnik. So far we have discussed some of 
the better known pathologies possessed by typical continuous functions. We turn now 
to two other kinds of typical continuous functions which suggest even more pathology. 

In 1935 Marcinkiewicz [6] proved that there exists a continuous function $\Phi$
with the property that to every measurable function f there corresponds a subsequence
$\{h_k\}$ of the sequence $\{1/n\}$ such that

(1) $\lim_{k \to \infty} \dfrac{\Phi(x + h_k) - \Phi(x)}{h_k} = f(x)$ almost everywhere.

Thus $\Phi$ is a (generalized) antiderivative for every measurable function f. As one
might expect, $\Phi$ is typical: the class of all functions with this property is residual
in C[a,b]. If we consider only the case where f is a constant, $f \equiv \lambda$, we see that
the function $\Phi$ has the property that every real number $\lambda$ is a derived number of $\Phi$
at almost every point; in fact, not just a derived number, but a derived number
relative to a fixed sequence $\{h_k\}$. This result shows that while a typical function in
C[a,b] is very wildly behaved, it does possess a certain type of regularity: for
each real number $\lambda$, its behavior near any point outside some null set is similar
to its behavior near any other point relative to the sequence $\{h_k\}$.

By considering only constant functions f, we have given up a good deal of
Marcinkiewicz’s result. But we shall gain in another direction. Let us write the
expression (1) in a somewhat different form, at the same time requiring that $f \equiv \lambda$:

(2) $\lim_{\substack{y \to x \\ y \in \{x + h_k\}}} \dfrac{\Phi(y) - \Phi(x)}{y - x} = \lambda$ almost everywhere.

This displays the fact that $y \to x$ while restricted to the set $\{x + h_k\}$.

Can we fatten up the set $\{x + h_k\}$ to some larger set E(x) so that (2) remains valid
whenever $y \to x$, $y \in E(x)$? Jarnik [4] has shown that there exist continuous functions
$\Phi$ with the property that for almost every x and for every real number $\lambda$ there exists
a set E(x) having right upper density 1 at x such that

$$\lim_{\substack{y \to x \\ y \in E(x)}} \frac{\Phi(y) - \Phi(x)}{y - x} = \lambda.$$

The statement that E(x) has right upper density 1 at x means that

$$\limsup_{h \to 0} h^{-1} \mu(E(x) \cap [x, x + h]) = 1.$$

Thus E(x) is ‘‘fat’’ near x, relative to some net of intervals [x, x + h] contracting
to x. We say that $\lambda$ is an essential derived number of $\Phi$ at x.

Once again, the class of functions with this property is residual in C[a, b]. Thus
the typical continuous function has every real number as an essential derived number 
at almost every point. What could be more regular than that? 

We close with three remarks. 


REMARK 1. It follows from Lusin’s theorem that any measurable function is
the almost everywhere limit of a sequence of continuous functions. Any Marcinkiewicz
type function can be used to give rise to such a sequence. Thus if f is measurable
and if $\Phi$ is a Marcinkiewicz type function for which (1) holds, then $\Phi_k(x) \to f(x)$
almost everywhere, where

$$\Phi_k(x) = \frac{\Phi(x + h_k) - \Phi(x)}{h_k}.$$

Each function $\Phi_k$ is clearly continuous.


REMARK 2. Every continuous function is differentiable when restricted to a 
suitable subset of [a, b]. In fact, it was shown [3] that if f is defined and continuous 
on a set containing a nonempty perfect set P, there exists a nonempty perfect set 
$Q \subset P$ such that the restriction of f to Q is infinitely differentiable. (Here we allow
the possibility that the derivatives take infinite values.) 


REMARK 3. We can sum up by saying that the typical continuous function has
at no point a finite or infinite derivative or a finite one-sided derivative. It has infinite
one-sided derivatives on a nondenumerable set. Every real number $\lambda$ is an essential
derived number a.e.


The author was supported by NSF grant GP 18968. 


References 


1. S. Banach, Über die Baire’sche Kategorie gewisser Funktionenmengen, Studia Math., 3
(1931) 174-179.

2. A. S. Besicovitch, Diskussion der stetigen Funktionen im Zusammenhang mit der Frage
über ihre Differentierbarkeit, Bull. Acad. Sci. de Russie, 19 (1925) 527-540.

3. A. M. Bruckner, J. G. Ceder and M. L. Weiss, On the differentiability structure of real func- 
tions, Trans. Amer. Math. Soc. 142 (1969) 1-13. 

4. V. Jarnik, Sur les nombres dérivés approximatifs, Fund. Math., 22 (1934) 4-16. 

5. R. Jeffery, The Theory of Functions of a Real Variable, Math. Expositions No. 6, University
of Toronto Press, 1969. 

6. J. Marcinkiewicz, Sur les nombres dérivés, Fund. Math., 24 (1935) 305-308. 

7. S. Mazurkiewicz, Sur les fonctions non dérivables, Studia Math., 3 (1931) 92-94. 


8. M. Mikolas, Construction des familles de fonctions partout continues non dérivables, Acta
Sci. Math. Szeged, 17 (1956) 49-62.

9. A. P. Morse, A continuous function with no unilateral derivatives, Trans. Amer. Math. Soc.,
44 (1938) 496-507.

10. S. Saks, On the functions of Besicovitch in the space of continuous functions, Fund. Math.,
19 (1932) 211-219.


REPRESENTING A FINITE BOREL MEASURE IN 
TERMS OF ITS DISTRIBUTION FUNCTION 


J. J. HIGGINS, University of Missouri-Rolla


A real-valued function F defined on $(-\infty, \infty)$ which is bounded, non-decreasing,
continuous from the right, and which satisfies the condition that $F(x) \to 0$ as $x \to -\infty$
is called a distribution function. Associated with a finite Borel measure $\mu$ defined
on the Borel subsets of $(-\infty, \infty)$ is the distribution function $F(x) = \mu[(-\infty, x]]$.
We consider the problem of giving an explicit expression for $\mu$ in terms of its
distribution function F and Lebesgue measure $\lambda$. When $\mu$ has a continuous distribution
function F, we see immediately that $\mu[(a,b]] = \lambda[F((a,b])]$ for all $a < b$. Will
the equation $\mu(A) = \lambda[F(A)]$ hold for all Borel sets A? The answer is yes, provided
F is continuous. Rather than proving this result directly, we prove a somewhat
more general result to take into account the possibility that F has discontinuities.
The result seems to give a clear picture of the way in which Borel measures are
constructed.

We first establish two lemmas. 


LEMMA 1. Let $f : (-\infty, \infty) \to (-\infty, \infty)$ be a non-decreasing function. For
each real number y, let $C_y = \{x : f(x) = y\}$, and let $D = \{y : C_y \text{ contains more than one point}\}$.
Then D is a countable set. Furthermore, if A and B are disjoint
sets, then $f(A) \cap f(B)$ is a countable set.


Proof. As a consequence of the monotonicity of f, if $C_y$ contains more than
one point, then $C_y$ contains an open interval. Thus each member of the collection
$\{C_y\}_{y \in D}$ contains a rational number $r_y$; the numbers $r_y$ are distinct since the sets
of the collection are pairwise disjoint. It follows that D is a countable set. If A and
B are disjoint sets, then $f(A) \cap f(B) \subset D$; consequently, $f(A) \cap f(B)$ is a countable
set.


LEMMA 2. If F is a distribution function, then $\lambda[F(\cdot)]$ is a finite Borel
measure.


Proof. First we show that F(A) is a measurable set for all Borel sets A. To this
end, consider the collection Q = {A: F(A) is a measurable set}. Let $x_1, x_2, \ldots$ denote
the discontinuities of F. Let $E_i$ equal either $(F(x_i^-), F(x_i))$ or $[F(x_i^-), F(x_i))$, depending
on whether $F(x_i^-)$ is in the range of F or not. Then

$$F([a,b]) = [F(a), F(b)] \setminus \bigcup_{i=1}^{\infty} E_i.$$

It follows that Q contains all intervals. Clearly Q contains all countable unions
of its members. From the equality

$$F(A^c) = [(F(A))^c \cap F((-\infty, \infty))] \cup [F(A) \cap F(A^c)]$$

and since $F(A) \cap F(A^c)$ is a countable set, it follows that Q contains all complements
of its members. Thus Q is a sigma-algebra of subsets of $(-\infty, \infty)$ containing all
intervals, and therefore contains all Borel sets.

From what has been shown, we see that $\lambda[F(\cdot)]$ is a nonnegative Borel-set function
which assigns value zero to the empty set, leaving only the countable additivity
to be established. Suppose $\{A_i\}_{i=1}^{\infty}$ is a sequence of pairwise disjoint Borel sets.
The countable additivity of $\lambda[F(\cdot)]$ follows from Lemma 1 which gives us the fact
that the sets of the sequence $\{F(A_i)\}_{i=1}^{\infty}$ are pairwise disjoint except on a set of
Lebesgue measure zero. It follows that $\lambda[F(\cdot)]$ is a finite Borel measure.


THEOREM. Let $\mu$ be a finite Borel measure whose distribution function is F.
Let $x_1, x_2, \ldots$ denote the discontinuities of F, and let $j(x_i) = F(x_i) - F(x_i^-)$. Let
$F_c(x) = F(x) - \sum_{x_i \le x} j(x_i)$. Then for all Borel sets A,

(1) $\mu(A) = \lambda[F_c(A)] + \sum_{x_i \in A} j(x_i)$.

Proof. It is well known that $F_c$ is a continuous distribution function (see [1],
Chapter 1). The right-hand side of (1) is therefore a Borel measure, being the sum
of two such measures, which agrees with $\mu$ on the intervals. The equality for all
Borel sets follows from the familiar uniqueness theorem for measures (see [1],
Theorem 2.2.3).
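
As an illustration of formula (1) on half-open intervals, consider a hypothetical measure (not taken from the note) consisting of Lebesgue measure on [0, 1] together with point masses 1/2 at 3/10 and 1/4 at 7/10. The sketch below compares the measure of (a, b] computed directly with the right-hand side of (1), using the fact that a continuous non-decreasing function maps (a, b] onto an interval of length $F_c(b) - F_c(a)$.

```python
from fractions import Fraction

# Hypothetical atoms x_i with jumps j(x_i); chosen only for this illustration.
ATOMS = {Fraction(3, 10): Fraction(1, 2), Fraction(7, 10): Fraction(1, 4)}


def clamp01(x):
    """Length of [0, 1] lying to the left of x."""
    return min(max(x, Fraction(0)), Fraction(1))


def mu_interval(a, b):
    """mu((a, b]) computed directly: Lebesgue part plus atoms lying in (a, b]."""
    return clamp01(b) - clamp01(a) + sum(m for x, m in ATOMS.items() if a < x <= b)


def F(x):
    """Distribution function F(x) = mu((-inf, x])."""
    return clamp01(x) + sum(m for t, m in ATOMS.items() if t <= x)


def F_c(x):
    """Continuous part: F minus the jumps at discontinuities x_i <= x."""
    return F(x) - sum(m for t, m in ATOMS.items() if t <= x)


def rhs_interval(a, b):
    """lambda[F_c((a, b])] + sum of j(x_i) over x_i in (a, b]."""
    lebesgue_of_image = F_c(b) - F_c(a)  # image of (a, b] under F_c is an interval
    return lebesgue_of_image + sum(m for x, m in ATOMS.items() if a < x <= b)


if __name__ == "__main__":
    a, b = Fraction(3, 10), Fraction(7, 10)
    print(mu_interval(a, b), rhs_interval(a, b), F(b) - F(a))
```

Only intervals are checked here; the extension to all Borel sets is exactly what the uniqueness theorem cited in the proof provides.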

From this theorem, we see that once the properties of Lebesgue measure are 
established and once some of the properties of a distribution function are estab- 
lished, the construction of an arbitrary finite Borel measure is immediate. (For 
an alternate approach to the problem of constructing Borel measures, see [2], page 
261.) Furthermore, we have $F_c = F_{ac} + F_s$, where $F_{ac}$ is an absolutely continuous
distribution function and $F_s$ is a continuous, singular distribution function (see [1],
Chapter 1). From this we have

$$\lambda[F_c(A)] = \int_A F_{ac}'(x)\,dx + \lambda[F_s(A)].$$

The term $\lambda[F_s(A)] + \sum_{x_i \in A} j(x_i)$ is the singular part of the Lebesgue decomposition
of $\mu(A)$, and $\int_A F_{ac}'(x)\,dx$ is the absolutely continuous part.


References 


1. Kai Lai Chung, A Course in Probability Theory, Harcourt, Brace and World, New York,
1968.
2. H. L. Royden, Real Analysis, second edition, Macmillan, New York, 1968.


MATHEMATICAL EDUCATION 
EDITED BY J. G. HARVEY AND M. W. POWNALL


Material for this Department should be sent to Shirley Hill, Department of Mathematics, 
University of Missouri, Kansas City, MO 64110, or to Paul Mielke, Department of Mathe- 
matics, Wabash College, Crawfordsville, IN 47933. 


USING STUDENT-TUTORS IN PRECALCULUS INSTRUCTION 


T. A. EISENBERG and J. B. BROWNE, Northern Michigan University


The purpose of this paper is to discuss a method of instruction which seems 
to be effective in handling courses with large enrollments which are primarily aimed 
at skill development. 

The catalog description of the course we are describing is:

Math 100 Basic Mathematics, 4 credits. A study of the fundamental operations of algebra,
factoring, graphing, linear equations, and an introduction to exponents, radicals, and
quadratic equations.


The text used [1] contains most topics found in an elementary algebra course offered 
in high school. The course usually has an enrollment of 200 to 250 students. 

We take the position that a substantial group of courses offered
by a mathematics department under the rubric of precalculus have skill development
as their main objective. The purpose of these courses is to eradicate a student’s
deficiencies and to enable him to perform a predetermined set of desired skills.
Such precalculus courses are not ends in themselves, but are designed to prepare
a student to handle whatever mathematics he will encounter in his other fields of 
endeavor. Survey, appreciation, and teacher education courses in mathematics 
which do not have at their core skill development, belong to a different group of 
precalculus offerings. 

Because of a high failure rate in Math 100 we decided to change the classroom 
format of the course. Commencing in the fall semester of the ’71-’72 academic 
year, a major switch from a large lecture-only format to a lecture-recitation format


PROBLEMS AND SOLUTIONS 


EDITED BY EMORY P. STARKE


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN,
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: EARL M. L.
BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, WILLIAM R.
GEIGER, RAMESH GUPTA, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS
S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND WILLIAM L. SOULE, JR.


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to
Professor Starke.


ELEMENTARY PROBLEMS


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department,
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of Elementary
Problems in this issue should be typed (with double spacing) and should be mailed before
September 30, 1973.
An asterisk (*) means neither the proposer nor the editors supplied a solution.


E 2420. Proposed by E. T. H. Wang, University of Waterloo, Canada 


Find all triangles with integral sides, each of which has its perimeter numerically
equal to its area.
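
A bounded computer search can suggest what to expect here, although it proves nothing about sides beyond the bound. The sketch below is hypothetical and exploratory: it uses Heron's formula in the form 16·(area)² = (a+b+c)(b+c−a)(a+c−b)(a+b−c), so that area = perimeter becomes 16(a+b+c) = (b+c−a)(a+c−b)(a+b−c), an identity in integers.

```python
def equable_triangles(max_side):
    """Integer triangles a <= b <= c <= max_side whose area equals their perimeter."""
    found = []
    for a in range(1, max_side + 1):
        for b in range(a, max_side + 1):
            # c ranges over b <= c < a + b (triangle inequality), capped at max_side
            for c in range(b, min(a + b, max_side + 1)):
                perimeter = a + b + c
                if 16 * perimeter == (b + c - a) * (a + c - b) * (a + b - c):
                    found.append((a, b, c))
    return found


if __name__ == "__main__":
    for triangle in equable_triangles(40):
        print(triangle)
```

Working with the squared-area identity avoids floating-point square roots entirely, so every reported triple satisfies the condition exactly.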


E 2421. Proposed by R. C. Buck, University of Wisconsin 


(i) Show that if G is a group in which the map $x \to x^3$ is a monomorphism (1:1
homomorphism, but not necessarily onto), then G is abelian.
(ii) Exhibit a non-abelian group for which $x \to x^4$ is an automorphism.
(iii) Are there examples for every other exponent > 4?


E 2422*. Proposed by E. M. Reingold, University of Illinois, Urbana 


Two rectangles are incomparable if neither can be placed inside the other when
they are aligned so that corresponding sides are parallel. Prove or disprove: No
rectangular region can be tiled with mutually incomparable rectangles.


E 2423. Proposed by Lyles Hoshek, Monterey Park, California, and B. M.
Stewart, Michigan State University


Let there be given a plane convex quadrilateral of area A. Divide each of its four
sides into n equal segments and join the corresponding points of division of opposite
sides, forming $n^2$ smaller quadrilaterals. Prove: (a) the n smaller quadrilaterals in
any diagonal (ordinary or broken) have a composite area equal to A/n; (b) the
composite area of any row of smaller quadrilaterals and its complementary row (row
i and row n + 1 − i) is equal to 2A/n. (In particular, if n is odd this implies that
the composite area of the middle row is A/n.)


E 2424. Proposed by P. J. Murray, Westminster College and independently by 
E. T. H. Wang, University of Waterloo 


Let $S_n$ denote the set of all permutations of the first n natural numbers. We can
define a metric on $S_n$ as follows: If $\sigma, \tau \in S_n$, then $d(\sigma, \tau) = \sum_{i=1}^{n} |\sigma(i) - \tau(i)|$. What
possible numerical values can d assume?
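
For small n the set of values of d can simply be enumerated, which may help in guessing the general answer. The following brute-force sketch (function names hypothetical) is exploratory only and is feasible just for small n, since it examines all pairs of permutations.

```python
from itertools import permutations


def distance(sigma, tau):
    """d(sigma, tau) = sum over i of |sigma(i) - tau(i)|."""
    return sum(abs(s - t) for s, t in zip(sigma, tau))


def distance_values(n):
    """Sorted list of all values taken by d on S_n x S_n."""
    perms = list(permutations(range(1, n + 1)))
    return sorted({distance(s, t) for s in perms for t in perms})


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, distance_values(n))
```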


E 2425. Proposed by C. A. Nicol, University of South Carolina 
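For each positive real number x let $\pi_2(x)$ denote the number of twin primes not
exceeding x. Show that

$$\pi_2(x) = 2 + \sum_{7 \le n \le x} \sin\left(\frac{\pi}{2}\,(n + 2)\left[\frac{n!}{n + 2}\right]\right) \sin\left(\frac{\pi}{2}\, n \left[\frac{(n - 2)!}{n}\right]\right),$$

where brackets denote the greatest integer function.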
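
The formula can be tested in exact integer arithmetic, since sin(πk/2) for an integer k depends only on k mod 4. The sketch below assumes the summand reads sin((π/2)(n+2)[n!/(n+2)]) · sin((π/2) n [(n−2)!/n]) (one reading of the displayed formula, which is an assumption here) and counts a twin pair (p, p+2) whenever its smaller member p does not exceed x; the function names are hypothetical.

```python
from math import factorial


def sin_half_pi(k):
    """Exact value of sin(pi * k / 2) for an integer k."""
    return (0, 1, 0, -1)[k % 4]


def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))


def twin_count_formula(x):
    """2 plus the sum of the assumed sine products for 7 <= n <= x."""
    total = 2
    for n in range(7, x + 1):
        first = (n + 2) * (factorial(n) // (n + 2))
        second = n * (factorial(n - 2) // n)
        total += sin_half_pi(first) * sin_half_pi(second)
    return total


def twin_count_direct(x):
    """Number of pairs (p, p + 2) of primes with 3 <= p <= x."""
    return sum(1 for p in range(3, x + 1) if is_prime(p) and is_prime(p + 2))


if __name__ == "__main__":
    print([(x, twin_count_formula(x), twin_count_direct(x)) for x in (7, 11, 20, 45)])
```

The agreement reflects Wilson's theorem: for a prime p the quantity p[(p−2)!/p] equals (p−2)! − 1, while for composite p > 4 it equals (p−2)!, which is divisible by 4.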


SOLUTIONS OF ELEMENTARY PROBLEMS 


Pell and the Regular n-Simplex
E 2294 [1971, 405]. Proposed by Douglas Lind, Stanford University 
For what n does the regular n-simplex of side 1 have rational height? 


Solution by Michael Goldberg, Washington, D.C. The radius r of an inscribed
hypersphere of an n-dimensional simplex of unit edges is given by $r = \{1/[2n(n + 1)]\}^{1/2}$.
See H. S. M. Coxeter, Regular Polytopes, table on p. 295. The height is $(n + 1)r$.
Hence, the height is rational if r is rational, that is, if $2n(n + 1) = t^2$. Since n and
$n + 1$ have no factors in common, then either

$$n = 2x^2 \quad\text{and}\quad (1)\ \ 2x^2 + 1 = y^2,$$

or

$$n = v^2 \quad\text{and}\quad (2)\ \ 2u^2 - 1 = v^2.$$

Equations (1) and (2) are Pell equations whose solutions can be obtained by familiar
methods, e.g. Dickson, Introduction to the Theory of Numbers, p. 114. The least
positive solution of (1) is (x, y) = (2, 3). Hence the infinite sequence of solutions of (1)
may be obtained by the generating equation

$$y + \sqrt{2}\,x = (3 + 2\sqrt{2})^k, \qquad k = 1, 2, 3, \ldots,$$

and (x, y) = (2,3), (12,17), (70, 99), etc. Similarly, the infinite sequence of solutions
(u,v) is obtained as (u,v) = (5,7), (29,41), (169, 239), etc. Both sets of solutions are
found as rational approximations of $\sqrt{2}$ which are equal to the fractions y/x and
v/u. Hence n = 8, 49, 288, 1681, 9800, 57121, etc.
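
The list of dimensions can be cross-checked in two ways: by a direct search for n with 2n(n + 1) a perfect square, and by stepping through the two Pell families with multiplication by 3 + 2√2. The sketch below is an illustration with hypothetical function names; it takes n = 2x² from equation (1) and n = v² from equation (2).

```python
from math import isqrt


def is_square(m):
    return isqrt(m) ** 2 == m


def rational_height_dimensions(limit):
    """All n in [1, limit] with 2n(n+1) a perfect square (brute force)."""
    return [n for n in range(1, limit + 1) if is_square(2 * n * (n + 1))]


def pell_dimensions(count):
    """Dimensions from both Pell families, using (y, x) -> (3y + 4x, 2y + 3x)."""
    dims = []
    x, y = 2, 3   # least positive solution of 2x^2 + 1 = y^2
    u, v = 1, 1   # least positive solution of 2u^2 - 1 = v^2
    for _ in range(count):
        dims.append(2 * x * x)   # n = 2x^2
        dims.append(v * v)       # n = v^2
        x, y = 2 * y + 3 * x, 3 * y + 4 * x
        u, v = 2 * v + 3 * u, 3 * v + 4 * u
    return sorted(dims)


if __name__ == "__main__":
    print(rational_height_dimensions(60000))
    print(pell_dimensions(4))
```

Both routes also report n = 1 (the unit segment, whose height is 1), which corresponds to the solution (u, v) = (1, 1) and precedes the values listed in the solution.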

Also solved by R. T. Bumby, Cal Poly Solutions Group, R. J. Dickson, J. W. Grossman, Heiko
Harborth (Germany), Ralph Jones, D. C. Kay, Harry Lass & W. S. Sinclair, D. C. B. Marsh, Henrik
Meyer (Denmark), David Singmaster (England), John Stout, E. W. Trost (Switzerland), University
of Toronto Geometry Students, one unidentified solver, and the proposer.


Two Conditionally Convergent Infinite Series 


E 2361 [1972, 662]. Proposed by Richard Johnsonbaugh, Morehouse College 


Prove that the following series converge conditionally:

$$\sum_{n=1}^{\infty} (-1)^n (n^{1/n} - 1) \quad\text{and}\quad \sum_{n=1}^{\infty} (-1)^n [e - (1 + 1/n)^n].$$

Solution by Charles Chouteau, West Virginia State College. Convergence follows
since, for $n \ge 3$, $n^{1/n}$ is a monotonic decreasing sequence converging to 1 (consider
that $x^{1/x} = \exp[(1/x) \ln x]$), and $(1 + 1/n)^n$ is a sequence of positive terms monotonically
increasing to e.

To show that $\sum_{n=1}^{\infty} (n^{1/n} - 1)$ diverges, we use $x - 1 \ge \ln x$ for $x > 0$, with
equality if and only if x = 1. We have

$$\sum_{n=1}^{\infty} (n^{1/n} - 1) \ge \sum_{n=1}^{\infty} \ln n^{1/n} = \sum_{n=1}^{\infty} (1/n) \ln n,$$

which diverges by the integral test. To show that

$$\sum_{n=1}^{\infty} [e - (1 + 1/n)^n]$$

diverges, it suffices to show that $e - (1 + 1/n)^n \ge 1/(2n)$. We have

$$e = 1 + 1 + 1/2! + 1/3! + \cdots + 1/n! + \cdots,$$

$$(1 + 1/n)^n = 1 + 1 + (1 - 1/n)/2! + (1 - 1/n)(1 - 2/n)/3! + \cdots,$$

whence we have

$$e - (1 + 1/n)^n = (1/n)/2! + [1 - (1 - 1/n)(1 - 2/n)]/3! + \cdots.$$

Clearly all terms on the right of this last equation are positive, so their sum is greater
than 1/2n.
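
The two comparison inequalities used in the solution, $n^{1/n} - 1 \ge (\ln n)/n$ and $e - (1 + 1/n)^n \ge 1/(2n)$, can be spot-checked numerically. The sketch below uses floating point over a modest range of n and is only a sanity check under that assumption, not a proof; the function names are hypothetical.

```python
import math


def root_term(n):
    """n^(1/n) - 1, the absolute value of the n-th term of the first series."""
    return n ** (1.0 / n) - 1.0


def e_term(n):
    """e - (1 + 1/n)^n, the absolute value of the n-th term of the second series."""
    return math.e - (1.0 + 1.0 / n) ** n


def bounds_hold(n_max):
    """Check both lower bounds for every n from 1 to n_max."""
    return all(
        root_term(n) >= math.log(n) / n and e_term(n) >= 1.0 / (2 * n)
        for n in range(1, n_max + 1)
    )


if __name__ == "__main__":
    print(bounds_hold(2000))
```

Since both lower bounds are terms of divergent series, the check is consistent with the conclusion that neither series converges absolutely.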


Also solved by M. T. Bird, D. R. Breach (New Zealand), Martin Burger, B. R. Caine, John Christopher,
J. Gilles (Israel), M. G. Greening (Australia), Ellen Hertz, Michael K zstreva, Lew Kowarski,
P. A. Lindstrom, Z. C. Motteler, Roger Opp, G. R. Phillips, M. R. Railkar (India), W. C. Sisarcick,
Allen Stenger, Tau n Trent, M. K. Vamanamurthy (New Zealand), Charles Wexler, R. H. Yarbrough,
P. H. Young, and the proposer.


On Spherical Triangles 


E 2363 [1972, 663]. Proposed by Hüseyin Demir, Middle East Technical
University, Ankara, Turkey


Characterize pairs of spherical triangles ABC and A’B’C’ for which A’ = a,
B’ = b, C’ = c, A = a’, B = b’, C = c’.


Solution by M. G. Greening, University of New South Wales, Australia. For
any spherical triangle A’B’C’ we have:

(1) $\cos a' = \cos b' \cos c' + \sin b' \sin c' \cos A'$

and the two others following from the permutations (a, b, c), (a, c, b). So

(2) $\cos A = \cos B \cos C + \sin B \sin C \cos a$,

and so on. But from consideration of the polar triangle of ABC,

(3) $\cos A = -\cos B \cos C + \sin B \sin C \cos a$.

Then $\cos B \cos C = \cos C \cos A = \cos A \cos B = 0$ and we have, say, $A = B = \pi/2$,
yielding $a = b = \pi/2$ from (2). Also $\cos C = \cos c$, which must give $C = c$ as $c > 0$,
$C < \pi$. Consequently

$$A' = B' = a' = b' = \pi/2 \quad\text{and}\quad c' = C' = c = C,$$

so that the two triangles are necessarily congruent.


Also solved by Michael Goldberg, Lew Kowarski, Clellie Oursler & Eric Sturley, and the pro- 
poser. 


Sylvester, Elliott, and Motzkin 


E 2365 [1972, 663]. Proposed (part I) by Erwin Just and Kenneth Fogarty, 
Bronx Community College, and (part II) by J. B. Wilker, University of Toronto 


I. Let S be a finite set of points in the plane in which no three points are collinear 
and not all points are concyclic. Define a common point of S to be a point which 
lies on some circle which passes through precisely two other points of S. Must each 
point of S be a common point? 

II. Let S be a set of four or more points lying on a sphere but not on a circle. 
Prove that each point of S is on some circle containing precisely two of the other 
points of S. 


Solution by M. G. Greening, University of New South Wales. Given n points 
not all of which are collinear there is always a straight line containing two but not 
three of the points. (This is Sylvester’s problem of collinear points, dealt with in 
Coxeter, Introduction to Geometry, 1st or 2nd edition, pp. 65 and 181.) 

I. Invert with respect to a circle with center s ∈ S. Then the image of S − {s} 
contains two points t', u' not collinear with any other image point nor collinear with s, 
by Sylvester's problem. The preimage of the line t'u' must be a circle through 
s, t and u only, proving that s, and thus every point of S, is a common point. 

II. By the finiteness of S there is a point p ∉ S such that its antipodal point q ∉ S 
and neither p nor q lies on any circle through three or more points of S. Then 
stereographic projection of S onto π, the tangent plane at p, maps S onto S', which 
then satisfies the conditions of part I. Consequently each s' of S' lies on a circle t' 
through three points of S' but not through four. The bijective property of the 
projection and the choice of p then prove that each s ∈ S lies on a circle t through exactly 
three points of S. 


Also solved by A. Bruen, M. R. Railkar (India), R. R. Rottenberg (Israel), and the proposers. 


Editor's Comment. In addition to the references listed on p. 65 of Coxeter, Rottenberg lists P. D. 
T. A. Elliott who, in his paper On the number of circles determined by n points (Acta Mathematica 
Academiae Scientiarum Hungaricae 18 (1967), 181-188), proved that in any set of n > 3 points in 
the Euclidean plane not all on a circle or line, each point lies on at least 2(n−1)/21 circles containing 
exactly two other points of the set. This proves part I. Part II is proved in T. Motzkin, The lines and 
planes connecting the points of a finite set, Trans. Amer. Math. Soc. 70 (1951) 451-464. More generally, 
he uses "plane" and "convex surface containing no straight line" in place of our "circle" and "sphere" 
respectively. 


Not True for n = 8 


E 2366* [1972, 663]. Proposed by B. P. Gill, Demarest, N.J. 


Let V be the set of vertices of a regular 2n-gon and let A* and B* be the convex 
n-gons whose vertices are subsets A and B of V. If the set of lengths of all chords 
with both ends in A (with each chord length being counted according to its multiplicity) 
is identical to the like set for B, then is B* necessarily congruent to either A* or 
(V\A)*? 


Solution by Herbert Taylor, Skymount, Va. Not necessarily, as shown by the 
easily verified counterexample for n = 8 where A consists of vertices 1, 2, 3, 5, 6, 9, 
11, and 13, and B consists of vertices 1, 2, 3, 5, 9, 11, 13, and 14 of a 16-gon. In each 
figure one finds 3, 4, 3, 5, 3, 4, 3, 3 chords of the eight possible lengths, listed in order 
of increasing length. 
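
The counterexample can be checked mechanically. The following is a minimal sketch, not part of the published solution: chord lengths are measured in steps around the 16-gon, and congruence is tested by comparing cyclic gap sequences up to rotation and reflection (an assumption that is adequate here because both polygons are inscribed in the same circle). The function names are illustrative only.

```python
from collections import Counter
from itertools import combinations

M = 16  # number of vertices of the regular 2n-gon, n = 8


def chord_counts(vertices, m=M):
    """Count chords by length, measured in steps around the m-gon."""
    return Counter(min((a - b) % m, (b - a) % m)
                   for a, b in combinations(sorted(vertices), 2))


def gap_sequence(vertices, m=M):
    """Cyclic sequence of step gaps between consecutive chosen vertices."""
    vs = sorted(vertices)
    return [(vs[(i + 1) % len(vs)] - vs[i]) % m for i in range(len(vs))]


def congruent(X, Y, m=M):
    """True if the gap sequence of Y is a rotation or reflection of that of X."""
    gx, gy = gap_sequence(X, m), gap_sequence(Y, m)
    rotations = [gx[i:] + gx[:i] for i in range(len(gx))]
    return any(r == gy or r[::-1] == gy for r in rotations)


A = {1, 2, 3, 5, 6, 9, 11, 13}
B = {1, 2, 3, 5, 9, 11, 13, 14}
complement_A = set(range(1, M + 1)) - A

counts_A = [chord_counts(A)[d] for d in range(1, 9)]
counts_B = [chord_counts(B)[d] for d in range(1, 9)]
print(counts_A, counts_B)                       # chord counts by length 1..8
print(congruent(A, B), congruent(complement_A, B))
```

Equal chord counts together with two `False` congruence results confirm that B* is congruent to neither A* nor (V\A)*.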


Editorial Note. Two attempted solutions were also received, one utilizing modular congruences 
to obtain an algebraic criterion to determine for which n the problem is or is not true. A correct 
algebraic solution is solicited. 




Square Numbers in a Recursive Sequence 


E 2367 [1972, 772]. Proposed by Erwin Just, Bronx Community College 
Let F_n be the nth term of the sequence defined by 
$F_n = -F_{n-1} - 2F_{n-2}, \quad F_1 = 1, \quad F_2 = -1.$ 
Prove that $2^{n+1} - 7F_{n-1}^2$ is a perfect square. 


I. Solution by Trygve Breiteig, Kristiansand Laerarskole, Norway. By 
induction on n we prove that 

$2^{n+1} - 7F_{n-1}^2 = (2F_n + F_{n-1})^2.$ 

Clearly this equation holds for n = 2, so we assume it correct for n, n ≥ 2. Then we 
have 

$(2F_{n+1} + F_n)^2 = (-F_n - 4F_{n-1})^2 = F_n^2 + 8F_nF_{n-1} + 16F_{n-1}^2$ 
$= 2(4F_n^2 + 4F_nF_{n-1} + F_{n-1}^2) + 14F_{n-1}^2 - 7F_n^2$ 
$= 2(2^{n+1} - 7F_{n-1}^2) + 14F_{n-1}^2 - 7F_n^2 = 2^{n+2} - 7F_n^2.$ 


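II. Solution by C. R. Wall, University of South Carolina. By the recurrence 
relation we have F_0 = 0 and 

$\begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix} = \begin{pmatrix} -1 & -2 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_n & F_{n-1} \\ F_{n-1} & F_{n-2} \end{pmatrix} = \cdots = \begin{pmatrix} -1 & -2 \\ 1 & 0 \end{pmatrix}^{n-1} \begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix}.$ 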


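We take determinants to obtain $F_n^2 - F_{n+1}F_{n-1} = 2^{n-1}$, multiply both sides by 4, 
substitute $-F_{n+1} = F_n + 2F_{n-1}$, and subtract $7F_{n-1}^2$ from both sides to obtain 

$4F_n^2 + 4F_nF_{n-1} + F_{n-1}^2 = 2^{n+1} - 7F_{n-1}^2,$ 

so that 

$(2F_n + F_{n-1})^2 = 2^{n+1} - 7F_{n-1}^2.$ 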
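
Both identities are easy to confirm numerically for small n. The sketch below is only a sanity check under the stated recurrence, using exact integer arithmetic; the helper name `sequence` is illustrative.

```python
def sequence(n_max):
    """Return {n: F_n} for 0 <= n <= n_max, with F_0 = 0 and F_1 = 1."""
    F = {0: 0, 1: 1}
    for n in range(2, n_max + 1):
        F[n] = -F[n - 1] - 2 * F[n - 2]
    return F


F = sequence(40)

# 2^(n+1) - 7 F_{n-1}^2 = (2 F_n + F_{n-1})^2
checks = [2 ** (n + 1) - 7 * F[n - 1] ** 2 == (2 * F[n] + F[n - 1]) ** 2
          for n in range(1, 41)]

# Determinant identity from Solution II: F_n^2 - F_{n+1} F_{n-1} = 2^(n-1)
dets = [F[n] ** 2 - F[n + 1] * F[n - 1] == 2 ** (n - 1)
        for n in range(1, 40)]

print(all(checks), all(dets))
```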


Also solved by the proposer and sixty-four others, many of whom submitted more than one 
solution. Most of the solutions made use of formulas for recurrence functions or difference equations. 
There were a number of generalizations, the trend of which can be inferred from the following 
references (given by our readers): 

B. W. Jones, The Theory of Numbers, p. 73, ex. 5. 

G. Polya, Mathematical Discovery, pp. 108-109, 195-197. 

J. Arkin, Convergence of the coefficients in a recurring power series, Fibonacci Quarterly, vol. 7, 
no. 1, February 1969. 

M. Ward, The arithmetical theory of linear recurring series, Trans. Amer. Math. Soc., 35 (1933), 
600-628. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers — The State University, 
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with 
double spacing) on separate, signed sheets and should be mailed before September 30, 1973. 
Contributors (in the United States) who desire acknowledgement of receipt of their solutions 
are asked to enclose self-addressed, stamped postcards. 


5916. Proposed by J. G. Mauldon, Amherst College 


Let p and q be coprime integers with 0 < p <q and let f(-) be a non-negative 

function defined on {0,1,2,---,p} such that f(0)=0, f(p)=1 and, whenever 

km, =q with O< m,S p (i=1,2,---,k), we have f(m,) S Lizy f(m)). Is it nec- 
essarily true that max {f(n): n = 0,1,2,---, p} < 3q? 


5917. Proposed by Andrzej Ehrenfeucht, University of Colorado 
Let f: [0,1] → R² be a curve such that f(0) = (0,0), f(1) = (1,0) and for every 
0 ≤ r < s < t ≤ 1 we have 

$\max\{ |f(r) - f(s)|, \; |f(s) - f(t)| \} \le |f(r) - f(t)|,$ 

i.e., f never approaches any point through which it went and does not move away 
from any point through which it will pass. Prove that such curves have length and 
that those lengths are bounded. Is 2π/3 an upper bound? 




5918. Proposed by S. W. Golomb, University of Southern California 


Prove or disprove: Let C be a collection of distinct subsets of the positive integers 
which are totally ordered by the inclusion relation. Clearly C must be either a finite 
or a countable collection. 


5919. Proposed by L. A. Harris, University of Kentucky 
Show that if x and y are two vectors in a complex Hilbert space and if n is a 
positive integer, then 


2n 
L |x taty 
k=1 


msn (7) el? + [Po 


where λ = exp(iπ/n). Note that this generalizes the parallelogram law. 


“5920. Proposed by L. C. Washington, Princeton University 
Do there exist nonzero compact (Hausdorff) topological vector spaces over 
infinite fields? 


5921. Proposed by Paul Cohn, Bedford College, London, England 


Over any commutative field (e.g., the complex numbers), find two square matrices 
of the same order, A and B, such that every matrix in the pencil λA + μB is nilpotent, 
but A and B cannot be simultaneously triangularized. 


SOLUTIONS OF ADVANCED PROBLEMS 
σ-algebras in X × X 


5845 [1972, 307]. Proposed by J.A. Johnson, Oklahoma State University 


Let X be an uncountable set and 𝒮 the smallest σ-algebra of subsets of X × X 
containing all sets of the form A × B where A ⊂ X, B ⊂ X. Does 𝒮 contain all 
subsets of X × X? 


Solution by Manfred Stoll and R. L. Taylor, University of South Carolina. 
If the cardinality of X is greater than the cardinality of the continuum, an example 
of a subset of X × X which is not in 𝒮 is given on p. 261 of Halmos, Measure 
Theory, Van Nostrand Co., Inc. 1950. If one assumes the continuum hypothesis, 
the question has been answered affirmatively for all sets X with cardinality less 
than or equal to the continuum by B. V. Rao, On discrete Borel spaces and 
projective sets, Bull. A.M.S., 75 (1969) 614. 


Also solved by E. J. Braude, R. E. Frankfurt, S. J. Garland, A. A. Jagers (Netherlands), Douglas 
Lind, Michael McCoy, and Mary Powderly. 


Generalization of the Riemann-Lebesgue Lemma 


5846 [1972, 307]. Proposed by H. Kestelman, University College, London, 
England 


If f ∈ L(0,∞) and I(λ) is, for each positive λ, a subinterval of (0,∞), then 
$\lim_{\lambda \to \infty} \int_{I(\lambda)} f(t) \cos \lambda t \, dt = 0$. If I(λ) is assumed only to be the union of a finite set 
of intervals, the result is false. 


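Solution by David Borwein, University of Western Ontario. (i) Suppose I(λ) = 
(a_λ, b_λ), and let 

$S_\lambda = \int_{I(\lambda)} f(t) \cos \lambda t \, dt .$ 

Then 

$S_\lambda = \int_{a_\lambda}^{b_\lambda} f(t) \cos \lambda t \, dt = -\int_{a_\lambda - \pi/\lambda}^{b_\lambda - \pi/\lambda} f(t + \pi/\lambda) \cos \lambda t \, dt ;$ 

and so 

$2|S_\lambda| \le \int_{a_\lambda}^{b_\lambda} |f(t) - f(t + \pi/\lambda)| \, dt + \int_{a_\lambda}^{a_\lambda + \pi/\lambda} |f(t)| \, dt + \int_{b_\lambda}^{b_\lambda + \pi/\lambda} |f(t)| \, dt$ 

$= o(1)$ as λ → ∞. 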


(ii) Suppose now that 

$I(n) = \bigcup_{r=1}^{n} \left( \frac{(4r-1)\pi}{2n}, \; \frac{2r\pi}{n} \right), \quad n \ge 1,$ 

and that f is the characteristic function of (0, 2π). Then 

$\int_{I(n)} f(t) \cos nt \, dt = \sum_{r=1}^{n} \frac{1}{n} = 1.$ 
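
A numerical check of part (ii) is sketched below. It assumes the intervals ((4r−1)π/(2n), 2rπ/n) for r = 1, ..., n, on each of which cos nt is positive and integrates to 1/n, and it uses a plain midpoint rule; the function name and step count are illustrative choices.

```python
import math


def integral_over_I(n, steps=1000):
    """Midpoint-rule value of the integral of cos(n t) over the union I(n)."""
    total = 0.0
    for r in range(1, n + 1):
        a = (4 * r - 1) * math.pi / (2 * n)   # left endpoint of the r-th interval
        b = 2 * r * math.pi / n               # right endpoint of the r-th interval
        h = (b - a) / steps
        total += h * sum(math.cos(n * (a + (j + 0.5) * h)) for j in range(steps))
    return total


for n in (1, 7, 50):
    print(n, integral_over_I(n))
```

The value stays near 1 as n grows, so the integrals do not tend to 0.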


Also solved by L. E. Clarke (England), A. A. Jagers (Netherlands), O.P. Lossers (Netherlands), 
J. E. Mazo & G. J. Foschini, and the proposer. 


Unique Fixed Points 


5847 [1972, 307]. Proposed by Joe Beasley, Prairie View A&M College 


X is a complete metric space and T: X → X is a function with the following 
conditions: 

(1) There is a sequence {x_n} ⊂ X such that d(x_n, T(x_n)) → 0. 

(2) t: X → R defined by t(x) = d(x, T(x)) is lower semicontinuous. 

(3) d(T(x), T(y)) ≤ a d(x, T(x)) + b d(y, T(y)) + c d(x, y), where a, b, c are 
positive numbers and c < 1. 

Show that (A) T has a unique fixed point, and (B) none of conditions (1), (2) 
or (3) can be omitted. 


Note. This has already appeared in this Department. See 5775 [1972, 95]. 
New solutions were submitted by C. Crofts & W. R. Woodward, W. S. Hall, Rik Harris & 
Frank Oliva, Joel Levy, O. P. Lossers (Netherlands), Mark Rowles, T. Salat (Czechoslovakia), 
V. M. Sehgal, K. D. Stroyan, R. K. Tamaki, H. C. Wente, and the proposer. 


Zeros of z^n + z^m − 1 


5848 [1972, 399]. Proposed by A. Smith, Carleton University, Ottawa 


Let m and n be positive coprime integers. Find the number of zeros of the function 
z^n + z^m − 1 which lie inside the unit circle. 


Solution by Nancy M. Huddleston and C. C. Rousseau, Memphis State University. 
Let f(z) = z^n + z^m − 1 and let N denote the number of zeros of f(z) which lie 
within the unit circle. If z = e^{iθ}, then 

(1) $\operatorname{Re} f(e^{i\theta}) = 2\cos\left(\frac{(n+m)\theta}{2}\right)\cos\left(\frac{(n-m)\theta}{2}\right) - 1$ 

and 

(2) $\operatorname{Im} f(e^{i\theta}) = 2\sin\left(\frac{(n+m)\theta}{2}\right)\cos\left(\frac{(n-m)\theta}{2}\right).$ 

From (1) and (2) it follows that f(z) has zeros on the unit circle only if n + m ≡ 0 
(mod 6). If n + m ≢ 0 (mod 6), the principle of the argument can be applied and N is 
equal to the number of times w = f(e^{iθ}) encircles the origin as θ goes from 0 to 2π. 
When θ = 0, Re w = 1 and Im w = 0. Setting up the conditions 

(3) $\operatorname{Re} f(e^{i\theta}) > 0, \quad \operatorname{Im} f(e^{i\theta}) = 0,$ 

and using (1) and (2), we find that d/dθ [Im f(e^{iθ})] > 0 whenever (3) is satisfied. It 
follows that N is equal to the number of values of θ (0 < θ ≤ 2π) such that (3) is 
satisfied. Since, in order to satisfy (3), it is necessary to take (n + m)θ/2 = kπ, N is 
equal to the number of values of k (0 < k ≤ n + m) such that 

(4) $(-1)^k \cos\left(\frac{(n-m)k\pi}{n+m}\right) > \frac{1}{2}.$ 

Equivalently, we want the number of integers k (0 < k ≤ n + m) such that 

(5) $\left| \frac{(n-m)k\pi}{n+m} - l\pi \right| < \frac{\pi}{3}$ 

with k and l both odd or both even. Letting k + l = 2r and k − l = 2s, we write (5) 
in the form 

(6) $|ns - mr| < \frac{n+m}{6}.$ 

Since we require that (n, m) = 1, it follows that there exist solutions of ns − mr = p 
for every integer p. Moreover, for each such p, there is a unique solution which 
satisfies the condition 0 < k ≤ n + m. Hence, if n + m ≢ 0 (mod 6), the number of 
zeros lying within the unit circle is given by 

(7) $N = 2\left[\frac{n+m}{6}\right] + 1.$ 

If n + m ≡ 0 (mod 6), it follows easily that f(z) has precisely two zeros which lie on 
the unit circle and these zeros are at e^{iπ/3} and e^{−iπ/3}. For this case, one can apply 
the principle of the argument to an appropriately indented contour and obtain 

(8) $N = \frac{n+m}{3} - 1.$ 


Also solved by Robert Breusch, O. P. Lossers (Netherlands), St. Olaf College Students, and the 
proposer. 

Note. Breusch and Lossers express N as 2[(n + m − 1)/6] + 1 for all cases of n + m. The 
proposer notes that N is the nearest odd integer to (n + m)/3, and is the lesser one if n + m ≡ 0 
(mod 6). 
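
The count can be cross-checked numerically with the principle of the argument itself. The sketch below accumulates the change in arg f(e^{iθ}) around the unit circle and compares the winding number with the closed form 2[(n + m − 1)/6] + 1. It is an illustration only: the sample count is an arbitrary choice, and the case n + m ≡ 0 (mod 6), where zeros lie on the circle, is deliberately rejected rather than handled.

```python
import cmath
import math


def zeros_inside(n, m, samples=20000):
    """Winding number of f(e^{i*theta}) about 0, where f(z) = z^n + z^m - 1."""
    if (n + m) % 6 == 0:
        raise ValueError("zeros lie on the unit circle; an indented contour is needed")

    def f(z):
        return z ** n + z ** m - 1

    total = 0.0
    prev = f(complex(1.0, 0.0))
    for k in range(1, samples + 1):
        cur = f(cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(cur / prev)   # small increment of the argument
        prev = cur
    return round(total / (2 * math.pi))


def formula(n, m):
    """Closed form quoted in the Note: 2[(n + m - 1)/6] + 1."""
    return 2 * ((n + m - 1) // 6) + 1


pairs = [(n, m) for n in range(1, 11) for m in range(1, n + 1)
         if math.gcd(n, m) == 1 and n + m <= 11 and (n + m) % 6 != 0]
print(all(zeros_inside(n, m) == formula(n, m) for n, m in pairs))
```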


An Unsolved Approximation Question for π 


5849 [1972, 399]. Proposed by H. D. Ruderman, Hunter College High School, 
New York City 


For positive integers n, what is the Greatest Lower Bound for n| sinn|? 


Comments by F. G. Schmitt, Jr., Berkeley, Cal. If the simple continued fraction 
representation of π is denoted by π = [a_0; a_1, a_2, ...], with convergents p_n/q_n 
= [a_0; a_1, ..., a_n], then [1, p. 163] $|p_n/q_n - \pi| < 1/(a_{n+1} q_n^2)$, so 

$p_n |\sin p_n| = p_n |\sin(p_n - q_n\pi)| = p_n \sin|p_n - q_n\pi| < p_n |p_n - q_n\pi| < \frac{p_n}{a_{n+1} q_n}.$ 

But the convergents are bounded, so if the partial quotients a_n are unbounded, then 
glb n|sin n| = 0. 

However, it is unknown if the a_n are unbounded, although Lehmer (2) comments 
on their "seemingly lawless" behavior, and it is known (1, Thm. 196, p. 166) that 
almost all real numbers have unbounded partial quotients. The latest computation 
of the first 21230 values of a_n (3) has failed to disclose one larger than a_{431} = 20776, 
discovered by Lehman (4). Hence, since p_n/q_n < π for n even, the best result to date 
from this point of view is glb n|sin n| < π/20776. 
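
The effect of the convergents is visible already for the first few partial quotients. The sketch below computes convergents of π from its double-precision value (so only the first handful are trustworthy) and evaluates p_n |sin p_n|; the function name is illustrative.

```python
import math


def convergents(x, count):
    """First `count` convergents (p, q) of the simple continued fraction of x."""
    result = []
    p_prev, q_prev = 1, 0
    a = math.floor(x)
    p, q = a, 1
    result.append((p, q))
    frac = x - a
    for _ in range(count - 1):
        x = 1.0 / frac
        a = math.floor(x)          # next partial quotient
        frac = x - a
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        result.append((p, q))
    return result


conv = convergents(math.pi, 4)
values = [(p, p * abs(math.sin(p))) for p, q in conv]
for p, v in values:
    print(p, v)
```

The numerator 355, which precedes the large partial quotient 292, already gives a value of n|sin n| close to 0.0107.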


References 


1. Hardy, G. H. and E. M. Wright, An Introduction to the Theory of Numbers, 4th ed., Oxford, 
London, 1965. 

2. Lehmer, D. H., Note on an absolute constant of Khintchine, this MONTHLY, 46 (1939) 148-152. 

3. Choong, K. Y., D. E. Daykin and C. R. Rathbone, Rational Approximations to π, Math. 
Comp., 25 (1971) 387-392. 

4. Lehman, R.S., A Study of Regular Continued Fractions, BRL Report 1066, Aberdeen Proving 
Ground, Maryland, February 1959. 

Additional comments were submitted by Richard Gisselquist, A. A. Jagers (Netherlands), and 
Steven Russ. 


MATHEMATICAL MONTHLY 


THE OFFICIAL JOURNAL OF 
THE MATHEMATICAL ASSOCIATION OF AMERICA, INC. 


VOLUME 80 NUMBER 6 


CODEN: AMMYAE 


Papers in the Foundations of 
Mathematics 


Number 13 
of the 
HERBERT ELLSWORTH SLAUGHT 
MEMORIAL PAPERS 


JUNE-JULY 1973 


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; scc Statement of Policy (vol. 76, p.2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vQlic18,:Ps: AD. and. follow: the format in. current iszues of the MonTHLY. 

Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 
protection against loss. 

Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes 
11 months, Math. Education 10 months. 


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: tOALEX ROSENBERG, Department of Mathe- 
matics, Cornell ‘University, Ithaca, _ € ‘the’ ‘corresponding’ Associate Editor; 
ADVERTISING CORRESPONL JENCE: to’ RA HAILPERN, Mathematical Association of America, 
SUNY at Buffalo, Buffalo, N. ¥. ‘14214 ‘CHANGE. ‘OF “ADDRESS ahd SUBSCRIPTIONS: to A. B. 
WiLLcox, Mathematical Association of America, 1225 Connecticut Ave., N, W.,; Washington, D.C. 20036 


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect 
ASSOCIATE EDITORS 


JOSHUA BARLAZ        J. G. HARVEY         SEYMOUR SCHUSTER 
E. R. BERLEKAMP      ERIC S. LANGFORD     J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA     P. D. LAX            E. P. STARKE 
ROBERT GILMER        ARTHUR MATTUCK       LYNN A. STEEN 
RICHARD GUY          M. W. POWNALL        JAMES WENDEL 
RAOUL HAILPERN       GIAN-CARLO ROTA 


Annual dues for members of. the AssoniatoT (including a subscription to the American 
Mathematical Monthly) are: [Fore efS thé Subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of 
January, February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973. 


PRINTED IN THE UNITED STATES OF AMERICA. 


PAPERS IN THE FOUNDATIONS OF MATHEMATICS 


CONTENTS 
Preface . . . . . . . . . . . . . . . . J. C. ABBOTT 1 
Some Aspects of the Theory of Models . . . . Robert L. VAUGHT 3 
What is Nonstandard Analysis? . . . . . . W.A.J. LUXEMBURG 38 
Recursive Functions and Hierarchies . . . . . HILARY PUTNAM 68 


Function Theory on Some Nonarchimedean Fields . ABRAHAM ROBINSON 87 


The Thirteenth 
HERBERT ELLSWORTH SLAUGHT 
MEMORIAL PAPER 


Published as a supplement to the AMERICAN MATHEMATICAL MONTHLY 


Volume 80 June-July 1973 Number 6 


PREFACE 


The papers in this volume of the Slaught Papers were first presented to a Sym- 
posium at the United States Naval Academy in May of 1968. The purpose of the 
symposium was to provide a service to the Mathematical Community by presenting 
various leaders in the broad area of ‘‘Foundations of Mathematics”’ affording them an 
opportunity to explain the then current situation of their particular specialty and to 
provide a forum in which they could present their thoughts concerning future 
trends. Mathematics today and Foundations in particular is expanding so rapidly 
that it is difficult for even the expert to keep abreast in all but a narrow corner of a 
vast realm. At the same time there is a greater need than ever for the general mathe- 
matician and particularly the teacher of mathematics to keep posted on what is 
going on in the various fields of mathematics. It was for this purpose that four of 
the country’s leading authorities within the general area of Foundations were invited 
to give their considered views within their particular specialty. These speakers were 
selected not only because they possess the most intimate knowledge of their field, 
but because they are also known as great teachers themselves, who have a desire 
to share their expertise with others. 

The first paper on Model Theory is by Professor Robert Vaught of the University 
of California at Berkeley. Professor Vaught obtained his Ph. D. degree in 1954 
under the direction of Alfred Tarski at the University of California at Berkeley. 
After four years at the University of Washington he returned to Berkeley where 
he has been a Professor since 1963. He served as a Fulbright Research Scholar in 
Amsterdam (1956-57), a senior post-doctoral NSF Fellow at UCLA (1963-64), and 
a Guggenheim Fellow at Zürich (1967). Aside from the theory of models Professor 
Vaught has made important contributions to the foundations of set theory and to 
the theory of decision methods. 

The second paper is on Non-Standard Analysis by Professor W. A. J. Luxemburg 
of California Institute of Technology. Professor Luxemburg received his doctorate 
from Delft in 1955. He was a Fellow at Queens in 1955-56 and an Assistant Pro- 
fessor at Toronto (1956-58) and has been a Professor at California Institute of 
Technology since then. His early interests were in functional analysis with emphasis 
on measure and integration theory, Banach spaces, locally convex spaces and Riesz 
spaces as well as ordinary differential equations and topological linear spaces. More 
recently he has done much work in non-standard Analysis, Boolean algebra and 
axiomatic set theory. 

The third paper in this series is on Recursive Functions by Professor Hilary 
Putnam of Harvard University’s Department of Philosophy. Professor Putnam re- 
ceived his Ph. D. from the University of California at Los Angeles in 1951. He 
was Instructor at Northwestern University (1952-53), Assistant Professor of 
Philosophy at Princeton (1953-60), where he also gave seminars in logic and the 
philosophy of science at Swarthmore, Associate Professor, Princeton (1960-61) and 




Professor of Philosophy of Science at MIT (1961-65). He has been Professor of 
Philosophy at Harvard since then. Professor Putnam was also a Rockefeller Founda- 
tion Research Fellow, 1951-52, Visiting Research Professor at the Minnesota Center 
for the Philosophy of Science (1957) and a Guggenheim Foundation Fellow, 1960-61. 
Professor Putnam has done research in the philosophy of science leading to articles 
on physical geometry, quantum mechanics, the analytic-synthetic distinction and 
its role in physical theory and various aspects of artificial intelligence, philosophical 
behaviorism and contemporary linguistic theory. His papers in logic include papers 
on three-valued logic, decidability, set theory, recursive functions and constructible 
sets, among many others. 

The final paper in this volume by Professor Abraham Robinson on Recent 
developments in Non-Standard Analysis was presented as a supplement to Professor 
Luxemburg’s paper at the Naval Academy in the Spring of 1970. Professor Robin- 
son received his doctorate in 1949 from the University of London. His original 
training was in mechanical and aerodynamic engineering, having served as an engi- 
neer with the Royal Aircraft Establishment from 1942-1951. He was at the Uni- 
versity of Toronto for the period 1951-57 and was Professor at Hebrew University, 
Israel, 1957-62. For the period 1962-67 he was Professor at the University of Cali- 
fornia at Los Angeles in Mathematics and Philosophy and went to Yale in 1967 
where he has remained since. At various times, he has been in a visiting capacity at 
Princeton, Oxford, Paris, Rome, Tübingen and Hebrew University. His early work 
was in wing theory and aerodynamics. Later he turned to model theory, the mathe- 
matics of algebra, mathematical logic and non-standard analysis with applications 
to analysis and algebra. He is at present President of the Association for Symbolic 
Logic. 

We wish to thank the Office of Naval Research for their financial support of the 
Symposium. 

J.C. ABBOTT 


SOME ASPECTS OF THE THEORY OF MODELS 
ROBERT L. VAUGHT, University of California at Berkeley 


In modern mathematics many of the questions studied are of the form: does 
a certain mathematical structure possess a given property? In the theory of models 
one studies the collection of all properties or, rather, some restricted but large col- 
lection of properties. Usually these are just the properties expressible in a certain 
language. 

For example, each of the properties of being a group, an Abelian group, or a 
torsion-free Abelian group is expressible in the so-called elementary language (or 
first-order predicate calculus). Thus, instead of saying the group 𝔊 is Abelian, we 
can say it is a model of the elementary sentence ∀x∀y(x ∘ y = y ∘ x). Such 
properties are also called elementary. 

It turns out that there are results applying to all elementary properties which 
are so strong that we may have been unaware of their instances in simple cases 
like those above. Apparently the result holds just because of the special way the 
property can be expressed, and we had not previously been paying attention to that 
aspect of the property. Perhaps the most fundamental such result is Gödel's 
Compactness Theorem: If σ_0, ..., σ_n, ... are elementary sentences and for each n, 
{σ_0, ..., σ_n} has a model, then {σ_0, ..., σ_n, ...} has a model. (This is proved, in a 
more general form, in §2.) 

In this article we introduce the reader with little or no background in logic 
to the basic results of model theory and to a selected group of advanced or recent 
topics as well. Specifically, in the first three sections the basic notions of model 
theory and the Compactness and Löwenheim-Skolem theorems, as well as their 
applications, are treated. In the last four sections, we discuss saturated models, 
ultraproducts, the ω-completeness theorem and Löwenheim-Skolem theorems for 
two cardinals. Of course very many important topics are omitted, but those we 
do cover hang together closely and give a good idea of the flavour of the subject. 
Full proofs are given for all theorems, except for a few results mentioned on the side. 


1. The elementary language. Groups are structures of the form (G, ∘). Some 
other structures of different forms are ordered sets (A, <), ordered rings with unit 
(A, +, ·, <, 1), etc. Some structures, such as groups with operators, even have 
infinitely many distinguished operations (or distinguished relations or distinguished 
elements). In general, then, to determine a 'similarity type' of structures, we are 
given an arbitrary set S of 'symbols', each of which is classified as an n-ary 
operation symbol (for some n ≥ 1), or as an n-ary relation symbol, or as an individual 
constant. 𝔄 is a structure of similarity type S (or S-structure) if 𝔄 = (A, F), 
where A is a non-empty set and F is a function on S such that for each X ∈ S, F(X) 
is an n-ary operation over A or an n-ary relation over A (i.e., a set of n-tuples of 
elements of A) or a member of A, according to which kind of symbol X is. We 
agree to write X^𝔄 for F(X) and to write 𝔄 = (A, X^𝔄)_{X∈S}. (If S is finite and we fix a 
listing X_1, ..., X_n of its members, then we can write 𝔄 = (A, X_1^𝔄, ..., X_n^𝔄) as in the 
examples above.) A is called the universe of 𝔄. We always understand that 'A' 
denotes the universe of 𝔄, 'B' that of 𝔅, etc. 𝔄 determines S, so we can write 
S = S(𝔄). Incidentally, 'c', 'd' will always denote individual constants, 'O' an 
operation symbol, and 'P', 'Q' relation symbols. 

The elementary language L(S) has the non-logical symbols X (for X ∈ S) and 
the logical symbols ~, ∧, ∃, ≈, and v_0, v_1, ..., v_n, ... (called variables), as well as 
(, ). 'u', 'v', 'w' always denote variables. The symbols are put together in the obvious 
ways, but it will be helpful to have the precise definitions: The set of terms of L(S) 
is the smallest set containing all variables and individual constants and containing 
Oτ_1 ... τ_n whenever it contains τ_1, ..., τ_n and O is an n-ary operation symbol in S. 
'τ' always denotes a term. If P is an n-ary relation symbol in S and τ_1, ..., τ_n are terms 
then Pτ_1 ... τ_n is an atomic formula of L(S). (≈ is considered to be a binary relation 
symbol.) The set of formulas of L(S) is the smallest set containing all atomic formulas 
of L(S) and containing ~φ, (φ ∧ ψ), and ∃v_n φ whenever it contains φ and ψ. 
θ, φ, ψ always denote formulas. The precise definition points up the inductive 
character of being a formula, which is of basic importance in proving results about 
the properties corresponding to formulas. 

We write φ ∨ ψ to mean ~(~φ ∧ ~ψ); and →, ↔, ∀v_n, and ∃!v_n (read 'there 
exists exactly one v_n such that') are understood similarly. But it should be 
remembered that in L(S) itself only ~, ∧, ≈, and ∃ occur. 

For some examples, let us now consider groups as structures of the form 
𝔊 = (G, ∘, e), of type S_0 = {∘, e}. (It is not necessary to have e, but it simplifies 
various formulas below.) The formula φ: ∀v_1(v_0 ∘ v_1 ≈ v_1 ∘ v_0) has v_0 as its only 
free variable. Formulas with no free variables are called sentences. σ will always 
range over sentences. 

We cannot ask whether the formula φ above is true (or holds) in the group 𝔊 
until we also have at hand a particular element b of G (to be 'denoted' by v_0). Given 
both 𝔊 and b we say that φ is satisfied in 𝔊 when b is assigned to v_0, and write 
$\mathfrak{G} \models \varphi\begin{bmatrix} v_0 \\ b \end{bmatrix}$, just if, in fact, b ∘ x = x ∘ b for all x ∈ G. (The term 'elementary' or 
'first-order' refers to the fact that in interpreting satisfaction in, say, 𝔊, all quantified 
variables range over the elements of G, and not over subsets or subrelations of G 
(second order) or natural numbers or whatever.) In a similar way we understand 
the notation 

$\mathfrak{A} \models \psi\begin{bmatrix} u_0 \cdots u_{n-1} \\ a_0 \cdots a_{n-1} \end{bmatrix}$ 

where a_i ∈ A and u_0, ..., u_{n−1} are distinct and include all free variables of ψ. For 
a sentence σ we write simply 𝔄 ⊨ σ and say σ is true in 𝔄 or 𝔄 is a model of σ. 




Strictly speaking, the general notion of satisfaction (like that of formula) is 
defined by an inductive procedure, with one clause for atomic formulas, and one each 
for ~, ∧, and ∃. The clause for ~ is 

$\mathfrak{A} \models {\sim}\varphi\begin{bmatrix} u_0 \cdots u_{n-1} \\ a_0 \cdots a_{n-1} \end{bmatrix}$ 

if and only if it is not the case that 

$\mathfrak{A} \models \varphi\begin{bmatrix} u_0 \cdots u_{n-1} \\ a_0 \cdots a_{n-1} \end{bmatrix}.$ 

From this example the reader should be able to supply the other clauses himself 
(and should do so). Actually, when operation symbols are present we must first 
define $\tau^{\mathfrak{A}}\begin{bmatrix} u_0 \cdots u_{n-1} \\ a_0 \cdots a_{n-1} \end{bmatrix}$, the value of the term τ when a_i is assigned to u_i (i < n). 
This is the same as the usual 'value' of a polynomial in algebra. 
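
For a finite structure the inductive definition can be carried out literally. The following is a minimal sketch, not taken from the article: formulas and terms are encoded as nested tuples (a hypothetical encoding chosen for illustration), only the primitive symbols ~, ∧, ∃ and ≈ are interpreted, and ∀ is introduced as the abbreviation ~∃~. The structure used is the symmetric group on three letters, and the formula is the φ: ∀v_1(v_0 ∘ v_1 ≈ v_1 ∘ v_0) considered above.

```python
from itertools import permutations

# Hypothetical encoding, for illustration only.
# Terms:    "v0" (a variable), ("const", name), ("op", name, t1, ..., tn)
# Formulas: ("eq", t1, t2), ("not", f), ("and", f, g), ("exists", var, f)


def value(term, struct, assign):
    """Value of a term under an assignment of elements to variables."""
    if isinstance(term, str):
        return assign[term]
    if term[0] == "const":
        return struct["consts"][term[1]]
    _, name, *args = term
    return struct["ops"][name](*(value(t, struct, assign) for t in args))


def satisfies(struct, formula, assign):
    """Inductive satisfaction: one clause each for atomic, ~, and, exists."""
    tag = formula[0]
    if tag == "eq":
        return value(formula[1], struct, assign) == value(formula[2], struct, assign)
    if tag == "not":
        return not satisfies(struct, formula[1], assign)
    if tag == "and":
        return (satisfies(struct, formula[1], assign)
                and satisfies(struct, formula[2], assign))
    if tag == "exists":
        _, var, body = formula
        return any(satisfies(struct, body, {**assign, var: a})
                   for a in struct["universe"])
    raise ValueError(f"unknown formula tag: {tag}")


def forall(var, body):
    """Abbreviation: 'for all var, body' means ~ exists var ~ body."""
    return ("not", ("exists", var, ("not", body)))


def compose(p, q):
    """Composition of permutations of {0, 1, 2}, written as tuples."""
    return tuple(p[q[i]] for i in range(3))


S3 = {"universe": list(permutations(range(3))),
      "ops": {"o": compose},
      "consts": {"e": (0, 1, 2)}}

phi = forall("v1", ("eq", ("op", "o", "v0", "v1"), ("op", "o", "v1", "v0")))

# The relation defined by phi: all b with S3 |= phi[b].
center = {b for b in S3["universe"] if satisfies(S3, phi, {"v0": b})}
is_abelian = satisfies(S3, forall("v0", phi), {})
print(center, is_abelian)
```

Here the set defined by φ is the center of the group, and the sentence obtained by quantifying v_0 universally expresses commutativity.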

When u_0, ..., u_{n−1} are exactly v_0, ..., v_{n−1} we write simply 𝔄 ⊨ φ[a_0, ..., a_{n−1}] 
instead of $\mathfrak{A} \models \varphi\begin{bmatrix} v_0 \cdots v_{n-1} \\ a_0 \cdots a_{n-1} \end{bmatrix}$. A formula ψ whose free variables are among v_0, ..., v_{n−1} 
is called an n-formula. It creates an n-ary relation ψ^𝔄 in a structure 𝔄, namely, 
ψ^𝔄 = {(a_0, ..., a_{n−1}): 𝔄 ⊨ ψ[a_0, ..., a_{n−1}]}. An n-ary relation over A is called definable 
in 𝔄 just if it is of this form for some ψ. Returning to the 1-formula φ: 
∀v_1(v_1 ∘ v_0 ≈ v_0 ∘ v_1), we see that for a group 𝔊, φ^𝔊 is just the center of the group 
(the set of elements which commute with all elements). 

It is clear that we can write down a familiar sentence σ (the conjunction of three 
or four sentences) such that a structure 𝔊 = (G, ∘, e) is a group if and only if it is 
a model of σ. To indicate this state of affairs we say that the property of being a 
group is elementary, or that the class of all groups is EC. Consider now the property 
of being a torsion-free group, i.e., a group (G, ∘, e) in which 

(1) for any x ≠ e, and any positive integer k, x^k ≠ e. 

It does not appear that we can state (1) equivalently as an elementary sentence, 
since it has a quantifier ranging over positive integers (and indeed it can be proved 
impossible; see end, §2). However, we can paraphrase (1) by an infinite set Σ 
of sentences. Σ consists of ∀u(u ≉ e → u ∘ u ≉ e), ∀u(u ≉ e → u ∘ (u ∘ u) ≉ e), 
and so on. Thus, the class 𝒦 of all torsion-free groups is the class Mod Σ' of all 
models of a certain set Σ' = Σ ∪ {σ} of elementary sentences. (Being a model of 
a set of sentences means being a model of each.) We say 𝒦 is elementary in the 
wider sense or EC_Δ. It turns out that such 𝒦 are almost as well-behaved as the 
elementary. 

Another property of groups is that of being a torsion (or periodic) group, i.e., 
one in which every element is of finite order (for any a, a^n = e for some positive 
integer n). This property is known (see end, §2) not even to be elementary in the 
wider sense. However, it can be expressed by a single sentence of another language 
called L_{ω₁ω}, which we shall consider later. Such properties will be called EC(L_{ω₁ω}). 

Before giving more examples, we need to introduce another classification of 
properties, called PC_Δ. If 𝔄 is an S-structure and S' ⊆ S then the reduct 𝔄 ↾ S' 
is simply the structure (A, X^𝔄)_{X∈S'}. (An example is the additive group of a ring.) 
In the reverse direction, the structure 𝔅 = (𝔄, Y_i)_{i∈I}, where each Y_i is a finitary 
relation or operation over A or an element of A, is just obtained by adding or 
adjoining the Y_i to all the X^𝔄 (X ∈ S); thus 𝔅 ↾ S = 𝔄. (Strictly speaking we must first 
make up new symbols of the appropriate type and add them to S to get an S'' ⊇ S 
such that 𝔅 is an S''-structure.) 𝔅 is called an expansion of 𝔄. 

If 𝒦' is an EC_Δ-class of S' structures and S' ⊇ S, then the class 
𝒦 = 𝒦' ↾ S = {𝔄 ↾ S: 𝔄 ∈ 𝒦'} is called a PC_Δ-class (pseudo-elementary class). Clearly 
in defining 𝒦 we use an existential second-order quantifier and even more (𝔄 ∈ 𝒦 
if there exist relations or operations X^𝔄 (X ∈ S' − S) over A such that...). Therefore, 
as one would expect, it can be shown that PC_Δ is a broader classification than 
EC_Δ (see below). Nevertheless it turns out that PC_Δ-properties have many of the 
features of elementary properties. (The notions EC, EC_Δ, and PC_Δ were all 
introduced by Tarski. See [14] for an excellent general discussion.) 

For some more examples, consider the class 𝒦 of all well-orderings (A, <), 
and the class 𝒦' of all orderings which are not well-ordered. It is trivial to verify 
that 𝒦' is PC_Δ. 𝒦' is not EC_Δ (see end, §2). On the other hand, to say 𝔄 is 
well-ordered (every non-empty subset has a first element) requires universally quantifying 
over all subsets of A. This appears to be very strong, but of course we must beware 
of the possibility that some simpler, equivalent definition of 𝒦 can be given 
establishing that 𝒦 is, say, EC_Δ or EC(L_{ω₁ω}). Proving this is not so is sometimes difficult 
and there are many interesting problems of this form. As regards 𝒦, however, it 
is not PC_Δ (see end of §2) and it is not (unlike 'torsion group') even PC_Δ(L_{ω₁ω}) 
(see [8]). There is, however, a very strong language, L_{ω₁ω₁}, which has been studied, 
in which being well-ordered is expressible. 

We need a little more syntactical notation. If $\varphi$ is a formula, $u_0, \dots, u_{n-1}$ are
distinct variables and $\tau_0, \dots, \tau_{n-1}$ are terms then
$\varphi\binom{u_0 \cdots u_{n-1}}{\tau_0 \cdots \tau_{n-1}}$ is the formula
obtained from $\varphi$ by simultaneously substituting $\tau_i$ for each free occurrence of $u_i$ ($i < n$)
(after first changing bound variables so that no variables in the $\tau_i$ occur bound in $\varphi$).
$\varphi(\tau_0 \cdots \tau_{n-1})$ means $\varphi\binom{v_0 \cdots v_{n-1}}{\tau_0 \cdots \tau_{n-1}}$. We write $\Sigma \models \varphi$, and call $\varphi$ a consequence of $\Sigma$,
if $\varphi$ is true in all models of $\Sigma$; if $\varphi$ is not a sentence it is understood to be replaced
by its universalization $(\forall u_0 \cdots u_{n-1})\varphi$, where $u_0, \dots, u_{n-1}$ are all the free variables
of $\varphi$. $\varphi$ is logically valid if $\models \varphi$ (i.e., $\emptyset \models \varphi$). ‘$\Sigma$’ always denotes a set of sentences.
A binary operation (for example) is clearly essentially the same thing as a ternary
relation for which the sentence $(\forall u, v)\exists! w\, Puvw$ holds. Moreover, it is easy to check


1973] SOME ASPECTS OF THE THEORY OF MODELS 7 


that any elementary sentence involving a binary operation symbol $\circ$ can be expressed
as an elementary sentence about a corresponding $P$. For this reason, we
can assume in what follows that $S$ contains only relation symbols, or only relation
symbols and individual constants (which are really 0-ary operation symbols), and
nevertheless, when we want to, apply our results at once to structures of arbitrary 
similarity type. 

We are now ready to prove some theorems of model theory. However, many 
of these theorems and their proofs require an (elementary) use of ordinals and 
cardinals, so we close §1 with a brief discussion of set-theoretical notions. 

An ordinal $\alpha$ is the order-type of a well-ordered set. It is convenient to interpret
ordinals so that $\alpha = \{\beta : \beta < \alpha\}$. ‘$\xi$’, ‘$\eta$’, ‘$\alpha$’, ‘$\beta$’, ‘$\gamma$’ always denote ordinals. $\omega$ is
the first infinite ordinal and the set of all natural numbers. The cardinal number
$\overline{\overline{X}}$ of a set $X$ is the smallest ordinal $\alpha$ such that $\alpha$ and $X$ can be put in one-to-one
correspondence. (‘$\kappa$’, ‘$\lambda$’, ‘$\mu$’ always denote cardinals.) The infinite cardinals are
written in order as $\omega = \aleph_0, \aleph_1, \aleph_2, \dots$. $\kappa^+$ is the first cardinal greater than $\kappa$ (so
$(\aleph_\alpha)^+ = \aleph_{\alpha+1}$). $\kappa$ is regular if it is not the sum of a smaller number of smaller
cardinals. Any cardinal of the form $\aleph_{\alpha+1}$ is regular; $\aleph_\omega$ is the first cardinal which
is not regular.


2. The compactness theorem. We shall now prove the basic theorem of model 
theory: 


2.1 COMPACTNESS THEOREM. If every finite subset of $\Sigma$ (a set of $S$-sentences) has
a model, then so has $\Sigma$.


2.1 was proved by Gödel in 1930 for countable $S$, and extended later by Malcev,
Henkin, and A. Robinson to arbitrary S. The languages L(S) with S uncountable 
might be considered ‘imaginary,’ but there is no difficulty in considering them 
mathematically and it turns out that they are very useful as a tool in obtaining 
results about ‘real’ languages (with finite or denumerable S). 

To save time we agree to say that $\Sigma$ is finitely satisfiable if every finite subset
of $\Sigma$ has a model.

In ordinary mathematics, speaking roughly, when we arrive at the knowledge
that $\exists x\,\varphi(x)$, then we usually find it helpful to introduce a name for such an $x$ by
saying: ‘let $a$ be such an $x$’, ‘choose such an $x$’, etc. Later on we may do the same
thing for another formula $\exists y\,\psi(y)$, and so on. In the first step toward proving 2.1,
we now do something formal analogous to this, in a systematic way covering
all $\varphi$. However, in order to avoid having to ‘know that $\exists x\,\varphi(x)$’, we pass to the
formula $\exists y(\exists x\,\varphi(x) \to \varphi(y))$, which clearly always holds.

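Now speaking precisely, let $\exists u\,\varphi$ be an $S$-sentence, $c$ an individual constant not
in $S$, and $\mathfrak{A}$ an $S$-structure. Then clearly there is an $a \in A$ such that
$(\mathfrak{A}, a) \models \exists u\,\varphi \to \varphi(c)$. Thus

(1) Any $\mathfrak{A}$ can be expanded to a model of $\exists u\,\varphi \to \varphi(c)$.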
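
The choice of the element $a$ in (1) can be made concrete on a finite structure. The following is only a minimal sketch, assuming a finite universe given as a Python list and a formula represented as a Boolean function; the names `choose_witness` and `witness_axiom_holds` are ours and do not come from the text.

```python
def choose_witness(universe, phi):
    """Pick a denotation for the new constant c.

    If some element satisfies phi, return the first one found; otherwise
    return an arbitrary element, since the implication is then vacuously true.
    """
    for a in universe:
        if phi(a):
            return a
    return universe[0]


def witness_axiom_holds(universe, phi, c):
    """Evaluate (exists u phi(u)) -> phi(c) with c interpreted as given."""
    exists = any(phi(a) for a in universe)
    return (not exists) or phi(c)


universe = list(range(10))                   # a toy finite universe A
is_square_root_of_9 = lambda a: a * a == 9   # satisfied by 3
has_no_witness = lambda a: a > 100           # satisfied by nothing in A

for phi in (is_square_root_of_9, has_no_witness):
    c = choose_witness(universe, phi)
    print(c, witness_axiom_holds(universe, phi, c))
```

In both cases the implication comes out true, which is all that (1) asserts: the expansion always exists, whether or not the existential statement itself holds.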


We now want to form many new sentences like $\exists u\,\varphi \to \varphi(c)$ (each with a new $c$),
one corresponding to each $\varphi$, even including $\varphi$'s involving new $c$'s already introduced
earlier. One way of doing this precisely is as follows: Let $\kappa = \max(\aleph_0, \overline{\overline{S}})$.
Let $c_\xi$ ($\xi < \kappa$) be distinct constants not in $S$, and let $S' = S \cup \{c_\xi : \xi < \kappa\}$. By
simple cardinal arithmetic there are exactly $\kappa$ $S'$-sentences of the form $\exists u\,\varphi$. Let
$\exists u_0\varphi_0, \dots, \exists u_\xi\varphi_\xi, \dots$ ($\xi < \kappa$) be a list of all of them. By recursion, for $\xi < \kappa$, let $d_\xi$
be the first (in the list $c_0, c_1, \dots, c_\xi, \dots$) such that $d_\xi$ differs from $d_\eta$ for all $\eta < \xi$ and
$d_\xi$ does not occur in $\varphi_\eta$ for $\eta \le \xi$. (Thus $d_\xi$ is ‘new’ when introduced.) Now:

(2) Let $\Delta_S$ be the set of all the sentences $\exists u_\xi\varphi_\xi \to \varphi_\xi(d_\xi)$ for $\xi < \kappa$.

By applying (1) repeatedly (with various S’s) we obtain 

(3) Any $S$-structure can be expanded to a model of any finite subset of $\Delta_S$
(and indeed of the whole $\Delta_S$, though we never need this).

Hence clearly: 

(4) If $\Sigma$ is a finitely satisfiable set of $S$-sentences, then $\Sigma \cup \Delta_S$ is also finitely
satisfiable.


We are now ready for the second and last step in proving 2.1. In fact, by (4), 
2.1 will be established if we can show, 


LEMMA 2.2. If $\Sigma \cup \Delta_S$ is finitely satisfiable, then it has a model $\mathfrak{A}$.


Proof of 2.2. Throughout we are considering $L(S')$. By Zorn’s lemma, we can
extend $\Sigma \cup \Delta_S$ to a maximal finitely satisfiable set $\Sigma'$ of sentences. Note that (a)
for any $\sigma$, either $\sigma \in \Sigma'$ or $\sim\sigma \in \Sigma'$. Indeed, if not, there are finite $\Sigma_1, \Sigma_2 \subseteq \Sigma'$ such
that $\Sigma_1 \cup \{\sigma\}$ has no model and $\Sigma_2 \cup \{\sim\sigma\}$ has no model; hence $\Sigma_1 \cup \Sigma_2$ has no
model, a contradiction. Secondly, note that (b) if $\sigma_0, \dots, \sigma_{n-1} \in \Sigma'$ and
$\{\sigma_0, \dots, \sigma_{n-1}\} \models \sigma$ then $\sigma \in \Sigma'$. For if not, then (by (a)) $\sim\sigma \in \Sigma'$ and
$\{\sigma_0, \dots, \sigma_{n-1}, \sim\sigma\}$ is a finite subset of $\Sigma'$ having no model.

Put $C = \{c_\xi : \xi < \kappa\}$. If $c, d \in C$, let $cEd$ if and only if the sentence $c = d$ belongs
to $\Sigma'$. Then $E$ is an equivalence relation on $C$. Indeed, suppose, for example,
that $cEd$ and $c'Ed$. Then $c = d$ and $c' = d$ belong to $\Sigma'$ so, by (b), $c = c' \in \Sigma'$
and so $cEc'$. Our model $\mathfrak{A}$ for $\Sigma'$ will have as its universe $A$, the set of all equivalence
classes $\tilde{c} = c/E$ for $c \in C$. The denotation $c^{\mathfrak{A}}$ of each $c$ will be $\tilde{c}$. If $P \in S$, $P$ $n$-ary,
then we put: $P^{\mathfrak{A}}\tilde{d}_1 \cdots \tilde{d}_n$ if and only if $Pd_1 \cdots d_n \in \Sigma'$; that $P^{\mathfrak{A}}$ is well-defined follows
easily again from (b). As remarked already, we can assume that $S$ contains only
relation symbols, so $\mathfrak{A}$ is now completely defined.

Finally, we show that (c) for any $\sigma$, $\sigma \in \Sigma'$ if and only if $\mathfrak{A} \models \sigma$. (Of course, it
follows that $\mathfrak{A}$ is a model of $\Sigma \cup \Delta_S$ as desired.) (c) is proved by induction (on the
number of occurrences of $\exists$, $\wedge$, $\sim$ in $\sigma$). The case when $\sigma$ is atomic has really been
dealt with already. The cases when $\sigma$ is a negation or conjunction, and one direction
when $\sigma$ is $\exists u\,\varphi$ are argued again using (a) and (b) (and inductive hypotheses as below).
We illustrate by doing the remaining case. Suppose $\sigma$ is $\exists u\,\varphi$ and $\sigma \in \Sigma'$. Then we
use the fact that, by its construction, $\Delta_S$ (and hence $\Sigma'$) contains some sentence
$\exists u\,\varphi \to \varphi(d)$. Since also $\exists u\,\varphi \in \Sigma'$, we see from (b) that $\varphi(d) \in \Sigma'$. So, by the inductive
hypothesis, $\mathfrak{A} \models \varphi(d)$, whence $\mathfrak{A} \models \exists u\,\varphi$ as desired.
For later use, we remark that the proof of 2.2 actually showed a little more,
namely:

2.3 In 2.2, $\mathfrak{A}$ can be found such that $A = \{c_\xi^{\mathfrak{A}} : \xi < \kappa\}$.


In the argument above we have not really used the full (b), but only a few much
simpler principles, such as: if both $\sigma_1$ and $\sigma_1 \to \sigma_2$ belong to $\Sigma'$, then $\sigma_2 \in \Sigma'$; and:
every sentence $\varphi(\tau) \to \exists u\,\varphi \in \Sigma'$. If these are systematically formulated and the whole
proof of 2.1 recast somewhat, a proof is obtained for Gödel’s Completeness Theorem
(in fact the proof by Henkin; see e.g., [3]). This famous theorem says, roughly,
that all consequences of a set of sentences can be generated by certain natural
‘axioms’ and ‘rules of inference’, which are thus ‘complete’ for this purpose.
In a considerable part of model theory (and in almost all of this article) compact- 
ness is central but completeness does not enter. Therefore, we have given a direct 
proof of compactness, avoiding the refinements needed to obtain also completeness. 
A second proof of compactness (also direct) will be given in §5. 

It is worth remarking explicitly that an equivalent form of the Compactness
Theorem 2.1 is:

(5) If $\Sigma \models \sigma$ then for some finite $\Sigma' \subseteq \Sigma$, $\Sigma' \models \sigma$.
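
The equivalence of (5) with 2.1 rests on the observation that $\Sigma \models \sigma$ holds exactly when $\Sigma \cup \{\sim\sigma\}$ has no model. The following is a sketch of both directions in the notation above, taking $\sigma$ to be a sentence; the auxiliary sentence $\tau$ in the second half is our own choice for illustration.

```latex
\begin{align*}
\textbf{2.1} \Rightarrow \textbf{(5)}:\quad
 \Sigma \models \sigma
 &\iff \Sigma \cup \{\sim\sigma\} \text{ has no model} \\
 &\implies \Sigma' \cup \{\sim\sigma\} \text{ has no model for some finite } \Sigma' \subseteq \Sigma
   \quad \text{(contrapositive of 2.1)} \\
 &\iff \Sigma' \models \sigma . \\[4pt]
\textbf{(5)} \Rightarrow \textbf{2.1}:\quad
 \Sigma \text{ has no model}
 &\implies \Sigma \models \tau \wedge \sim\tau \quad \text{for any sentence } \tau \\
 &\implies \Sigma' \models \tau \wedge \sim\tau \text{ for some finite } \Sigma' \subseteq \Sigma
   \quad \text{(by (5))} \\
 &\implies \Sigma' \text{ has no model}.
\end{align*}
```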

In later sections we shall use the Compactness Theorem several times to obtain
various general theorems of model theory. But we could give right now a large
number of direct special applications. A typical example is a proof of the existence
of a non-Archimedean ordered field $\mathfrak{A} = (A, +, \cdot, <, 1)$. $\mathfrak{A}$ is Archimedean if for
any $a \in A$ there exists a positive integer $n$ such that $a < n \cdot 1$. Let $\sigma$ be the conjunction
of the usual (clearly elementary) axioms for ordered fields. Let $c$ be a new individual
constant and let $\Sigma$ consist of $\sigma$ together with the (infinitely many) axioms


$$1 < c,\quad 1 + 1 < c,\quad \dots,\quad \underbrace{1 + \cdots + 1}_{n \text{ times}} < c,\quad \dots$$


It is clear that for any finite subset $\Sigma'$ of $\Sigma$ we can find a real number $r$ such that
(Reals, $+, \cdot, <, 1, r$) is a model of $\Sigma'$. Hence by 2.1 $\Sigma$ has a model $(A, +, \cdot, <, 1, a)$,
and $(A, +, \cdot, <, 1)$ is clearly a non-Archimedean ordered field. By replacing $\sigma$ in
this argument by the (infinite) set of all sentences true in $\mathfrak{A}_R$ = (Reals, $+, \cdot, <, 1$)
we even get a non-Archimedean ordered field having exactly the same true elementary
sentences as the reals $\mathfrak{A}_R$.
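
The only computation in this argument is the choice of $r$ for a given finite subset. As a small sketch (the encoding of the axiom $1 + \cdots + 1 < c$ with $n$ summands by the integer $n$, and the function names, are our own assumptions):

```python
def real_witness(finite_axioms):
    """Return a real number r interpreting c so that every listed axiom holds.

    Each integer n in finite_axioms stands for the axiom
    1 + 1 + ... + 1 (n summands) < c.
    """
    return max(finite_axioms, default=0) + 1


def satisfies(r, axioms):
    """Check that r makes every listed axiom true in the reals."""
    return all(n < r for n in axioms)


for axioms in [set(), {1, 2, 3}, {5, 1000, 42}]:
    r = real_witness(axioms)
    print(sorted(axioms), "-> c =", r, satisfies(r, axioms))

# No single real number works for all the axioms at once:
r = real_witness({1, 2, 3})
print(satisfies(r, range(1, r + 2)))
```

Every finite subset is satisfied in the reals, yet the last line shows that a fixed real $r$ fails some axiom; the element interpreting $c$ in a model of all of $\Sigma$ therefore cannot be found in the reals themselves, and it is compactness that supplies the model.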

Non-Archimedean ordered fields can be constructed in other ways. However, 
these require at least a little special knowledge or ingenuity, while the above argument 
is extremely simple and general. The reader will have no trouble after seeing it in 
finding many interesting similar arguments. Incidentally, in these arguments, sometimes
no new individual constant is needed, sometimes infinitely many (for example,
to create a descending sequence).

Show, for example, that (the property of being a) ‘torsion-free group’ is not
$EC$. Or that ‘torsion group’ is not $EC_\Delta$, and not even $PC_\Delta$ (very easy). Show that
‘well-ordered’ is not $PC_\Delta$ (very easy). Finally, show that ‘not well-ordered’ is not
$EC_\Delta$. (Hint: show that the set of all sentences true in $(\omega, <)$ has a non-well-ordered
model.) For harder problems, consider the notions ‘free group’ and ‘simple group’.


3. Löwenheim-Skolem Theorems. Let $\mathfrak{A}$ and $\mathfrak{B}$ be $S$-structures. $\mathfrak{A}$ and $\mathfrak{B}$ are
called elementarily equivalent, written $\mathfrak{A} \equiv \mathfrak{B}$, if they have the same true elementary
sentences. For an easy but instructive example, consider the three non-isomorphic
groups (Integers, $+$), (Rationals, $+$), and (Non-zero reals, $\cdot$), and verify
that no two of these are elementarily equivalent.
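
One possible way to carry out this verification, offered only as a sketch, uses two sentences in a language with a single binary operation symbol $\circ$; the names $\delta$ and $\tau$ are ours.

```latex
\begin{align*}
\delta &:\quad \forall x\, \exists y\, (y \circ y = x)
  && \text{(every element is a `double')} \\
\tau &:\quad \exists e\, \exists x\, \bigl(\forall z\,(z \circ e = z) \;\wedge\; x \neq e \;\wedge\; x \circ x = e\bigr)
  && \text{(some non-identity element has order 2)}
\end{align*}
% delta: true in (Rationals, +); false in (Integers, +), since 1 is not y + y;
%        false in (Non-zero reals, .), since -1 is not a square.
% tau:   true in (Non-zero reals, .), witnessed by x = -1, e = 1;
%        false in (Integers, +) and in (Rationals, +), where x + x = 0 forces x = 0.
```

Thus $\delta$ separates the rationals from each of the other two groups, and $\tau$ separates the non-zero reals from the integers.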

We have just seen that there is an ordered field which is elementarily equivalent
to the reals $\mathfrak{A}_R$ but, being non-Archimedean, is certainly not isomorphic to $\mathfrak{A}_R$.
Thus, in general, $\equiv$ is definitely weaker than $\cong$. However, if $\mathfrak{A}$ (i.e., $A$) is finite
and also $S$ is finite, it is trivial to write down a single sentence $\sigma$ true in $\mathfrak{A}$ such that
any model of $\sigma$ is isomorphic to $\mathfrak{A}$. Even if $S$ is infinite, but $A$ is still finite, every $\mathfrak{B}$
elementarily equivalent to $\mathfrak{A}$ is isomorphic to $\mathfrak{A}$. (Inferring this from the (preceding)
result for finite $S$ is amusing, though not difficult.) On the other hand, we shall
see in 3.3 below, that for every infinite $\mathfrak{A}$, this fails; indeed, for each infinite
$\kappa$ ($\ge \overline{\overline{S}}$), there exists a structure $\mathfrak{B}$ of power $\kappa$ which is elementarily equivalent to $\mathfrak{A}$.


There is another very useful notion related to but stronger than elementary
equivalence. As always, $\mathfrak{A}$ is called a substructure of $\mathfrak{B}$ if $A \subseteq B$, $A$ is closed under
the (distinguished) operations of $\mathfrak{B}$, and each relation or operation of $\mathfrak{A}$ is the
natural restriction to $A$ of the corresponding relation or operation of $\mathfrak{B}$. We write
$\mathfrak{A} = \mathfrak{B} \mid A$. $\mathfrak{A}$ is called an elementary substructure of $\mathfrak{B}$ (written $\mathfrak{A} \prec \mathfrak{B}$) if in addition:
for any $n$, any $a_0, \dots, a_{n-1} \in A$ and any $n$-formula $\theta$, $\mathfrak{A} \models \theta[a_0, \dots, a_{n-1}]$ if
and only if $\mathfrak{B} \models \theta[a_0, \dots, a_{n-1}]$. Clearly $\mathfrak{A} \prec \mathfrak{B}$ means that $\mathfrak{A}$ is a substructure of
$\mathfrak{B}$ and $\mathfrak{A} \equiv \mathfrak{B}$, but it means more too. Thus, $(\omega - \{0\}, <)$ is even isomorphic to
$(\omega, <)$ but is not an elementary substructure (because ‘is the first element’ can be
expressed by an elementary formula).
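
To spell the last example out, one may take for ‘is the first element’ the following 1-formula; the particular formula $\theta$ is our choice, and any equivalent one would do.

```latex
\[
\theta(v_0) \;:\; \sim\exists v_1\,(v_1 < v_0),
\qquad
(\omega - \{0\}, <) \models \theta[1]
\quad\text{but}\quad
(\omega, <) \not\models \theta[1] \ \text{ (since } 0 < 1\text{)}.
\]
```

The element 1 of the smaller structure thus satisfies $\theta$ there but not in the larger one, which violates the defining condition of $\prec$ even though the two structures are isomorphic.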


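LEMMA 3.1. Let $\mathfrak{A}$ be a substructure of $\mathfrak{B}$. Suppose (*) for any $a_0, \dots, a_{n-1} \in A$,
any $b \in B$, and any $(n+1)$-formula $\varphi$, if $\mathfrak{B} \models \varphi[a_0, \dots, a_{n-1}, b]$ then for some
$a_n \in A$, $\mathfrak{B} \models \varphi[a_0, \dots, a_{n-1}, a_n]$. Then $\mathfrak{A} \prec \mathfrak{B}$.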


The reader will easily establish 3.1 by induction on the formula $\theta$ (in the definition
of $\prec$).

The converse of 3.1 is almost immediate. Thus (*) gives a characterization of
$\mathfrak{A} \prec \mathfrak{B}$ which refers only to satisfaction in $\mathfrak{B}$. Indeed, (*) can be regarded as saying
(as in algebra) that $A$ is a closed subset of $\mathfrak{B}$ in a certain sense (with respect to all $\varphi$).


The next theorem is perhaps the oldest theorem in model theory, and one of
the most fundamental. The first version appeared in 1915. 


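THEOREM 3.2 (Downward Löwenheim-Skolem Theorem). Suppose $\mathfrak{B}$ is an
$S$-structure, $\kappa$ is an infinite cardinal, and $\overline{\overline{S}} \le \kappa \le \overline{\overline{B}}$. Then $\mathfrak{B}$ has an elementary
substructure $\mathfrak{A}$ of power $\kappa$.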


The proof is little more than the proof, say, that any countable subset of a group
generates a countable subgroup. Choose a subset $X$ of $B$ of power $\kappa$. Now well-order
$B$. Then each, say, $(n + 1)$-formula $\varphi$ determines an $n$-ary (partial) operation
taking any $b_0, \dots, b_{n-1}$ into the ‘first’ $b$ such that $\mathfrak{B} \models \varphi[b_0, \dots, b_{n-1}, b]$. By 3.1, if
we take for $A$ the closure of $X$ under all these operations, then $\mathfrak{A}$ ($= \mathfrak{B} \mid A$) $\prec \mathfrak{B}$.
Since there are at most $\kappa$ of our operations, clearly $\overline{\overline{A}} = \kappa$.
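
The closure step has the same shape as generating a subgroup, and that familiar case is easy to run. The sketch below is an illustration of closing a set under a family of (possibly partial) operations on a finite toy structure, not an implementation of the operations determined by formulas; the function name `closure` and the choice of the integers modulo 12 are our assumptions.

```python
from itertools import product


def closure(generators, operations):
    """Smallest superset of `generators` closed under `operations`.

    `operations` is a list of (arity, function) pairs; a function may return
    None to signal that it is undefined at the given arguments.
    """
    closed = set(generators)
    changed = True
    while changed:
        changed = False
        for arity, op in operations:
            # sorted(closed) is a snapshot, so adding to `closed` below is safe
            for args in product(sorted(closed), repeat=arity):
                value = op(*args)
                if value is not None and value not in closed:
                    closed.add(value)
                    changed = True
    return closed


# Toy structure: the additive group of integers modulo 12.
ops = [
    (0, lambda: 0),                   # identity element
    (1, lambda a: (-a) % 12),         # inverse
    (2, lambda a, b: (a + b) % 12),   # group operation
]
print(sorted(closure({8}, ops)))      # subgroup generated by 8
```

In the proof of 3.2 the same loop is run with one operation for each formula, which is why the closure of a set of power $\kappa$ under at most $\kappa$ finitary operations again has power $\kappa$.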


If, in addition to the hypothesis of 3.2, a subset $Y$ of $B$ of power $\le \kappa$ is given,
then $\mathfrak{A}$ can also be required to include $Y$. This is clear from the proof, or can be
inferred by applying 3.2 to the structure $(\mathfrak{B}, y)_{y \in Y}$.

In the simplest case, 3.2 says that any infinite $\mathfrak{B}$ (whose $S$ is countable) has a
countable elementary substructure $\mathfrak{A}$. If we take for $\mathfrak{B}$ the class of all sets together
with the $\in$-relation, then we obtain a denumerable ‘world’ $\mathfrak{A}$ of sets in which all
the axioms of transfinite set theory hold, a state of affairs sometimes called Skolem’s
‘paradox’. Though there is apparently no real paradox, the existence of such an $\mathfrak{A}$
is of basic importance in the remarkable work of Gödel and Cohen on the consistency
and independence of the continuum hypothesis and other basic propositions
in set theory.

It is very significant that in 3.2 the smaller structure $\mathfrak{A}$ which is obtained is not
only elementarily equivalent to $\mathfrak{B}$ but is also a substructure of $\mathfrak{B}$ (and even an
elementary substructure). Suppose, for example, that $\mathfrak{B} = (B, <)$, where $<$ well-orders
$B$. As we have seen, a structure can be elementarily equivalent to $\mathfrak{B}$ but not
well-ordered. But obviously any substructure of a well-ordering is a well-ordering,
so 3.2 gives a countable well-ordering $\mathfrak{A}$ elementarily equivalent to $\mathfrak{B}$. (And in the
set theory example, the ‘ordinals’ of the countable ‘world’ are really well-ordered.)

Another important fact is that one can always add relations to $\mathfrak{B}$ before applying
3.2. A good example of this will be seen later on in the beginning of the proof of
7.1 (where $<$ is adjoined to the structure). We shall next prove a strengthening 3.2’ of 3.2 in
which the elementary language $L$ is replaced by a richer language. It will be quicker
(and also instructive) to prove 3.2’ anew by imitating the proof of 3.2. However,
we could instead infer 3.2’ directly from 3.2, by systematically employing the device
of adding new relations to $\mathfrak{B}$ before applying 3.2. (Lemma 7.5 in §7 tells just what
relations to add.)

We now consider for the first time a language richer than the elementary language.
The language $L_{\omega_1\omega}(S)$ has all the symbols of $L(S)$ and in addition the symbols $\bigvee$
and $\bigwedge$, which are used in forming countably infinite disjunctions and conjunctions.
(Thus this language is certainly ‘imaginary’, as a single formula can be infinitely
long.) The definition of formula (of $L_{\omega_1\omega}(S)$) is obtained from the old definition
by adding the clause: if, for some $k$, $\langle \varphi_0, \dots, \varphi_n, \dots \rangle$ is an arbitrary sequence of
$k$-formulas, then $\bigvee(\varphi_0, \dots, \varphi_n, \dots)$ and $\bigwedge(\varphi_0, \dots, \varphi_n, \dots)$ are formulas. (Strictly
speaking, this means that the set of formulas is defined as the smallest set $K$ meeting
the old conditions and closed under the formation of disjunctions and conjunctions
as above.) For convenience we often write, say, the disjunction $\bigvee(\varphi_0, \dots, \varphi_n, \dots)$ as
$\bigvee_n \varphi_n$ or as $\varphi_0 \vee \varphi_1 \vee \varphi_2 \vee \cdots$.

As an example, we can say in a single sentence of $L_{\omega_1\omega}$ that a group $\mathfrak{G} = (G, \circ, e)$
is a torsion group. The sentence is:


$$\forall u\,(u = e \;\vee\; u \circ u = e \;\vee\; u \circ u \circ u = e \;\vee\; \cdots).$$
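
For a finite group the infinite disjunction can be evaluated by searching for the first disjunct that holds. The following sketch does this with an explicit cut-off `max_terms`, which is our own device: in an arbitrary group no finite bound is available, and that is exactly why the disjunction has to be infinite.

```python
def first_true_disjunct(u, op, e, max_terms):
    """Least n <= max_terms such that u o u o ... o u (n factors) equals e, else None."""
    power = u
    for n in range(1, max_terms + 1):
        if power == e:
            return n
        power = op(power, u)
    return None


def is_torsion(universe, op, e, max_terms):
    """Check the torsion sentence, truncated to its first max_terms disjuncts."""
    return all(first_true_disjunct(u, op, e, max_terms) is not None for u in universe)


add_mod_6 = lambda a, b: (a + b) % 6
print([first_true_disjunct(u, add_mod_6, 0, 6) for u in range(6)])
print(is_torsion(range(6), add_mod_6, 0, 6))

# In (Integers, +) the element 1 satisfies no disjunct, whatever the cut-off:
print(first_true_disjunct(1, lambda a, b: a + b, 0, 1000))
```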


The clearly intended meaning for satisfaction and truth is made precise just as
before and the notation $\models$ is retained. Notice that a restriction was made in defining
‘formula’ so that every formula (of $L_{\omega_1\omega}$) has only finitely many free variables.
(Those left out could never have been parts of sentences anyway.)

Even if $S$ is finite, $L_{\omega_1\omega}(S)$ has uncountably many formulas, so we usually must
content ourselves with looking at ‘fragments’ (i.e., subsets) of the set of all formulas
(which can already be extremely rich). In particular, a formula $\varphi$ determines a very
important fragment, the set of all subformulas of $\varphi$. This notion has an obvious
meaning, but formally, the definition is inductive: The subformulas of $\sim\varphi$ are
$\sim\varphi$ and those of $\varphi$; the subformulas of $\bigvee_n \varphi_n$ are $\bigvee_n \varphi_n$ and all subformulas of any
$\varphi_n$; etc. By induction (based on the inductive definition of ‘formula’), the set of
subformulas of any formula $\varphi$ is immediately seen to be countable. Now let $\Phi$ be
any set of formulas of $L_{\omega_1\omega}(S)$. We write $\mathfrak{A} \prec (\Phi)\,\mathfrak{B}$ to mean that for any $\varphi \in \Phi$ and
any $a_0, \dots, a_{n-1} \in A$, $\mathfrak{A} \models \varphi[a_0, \dots, a_{n-1}]$ if and only if $\mathfrak{B} \models \varphi[a_0, \dots, a_{n-1}]$. In place
of 3.1 we have:


LEMMA 3.1’. $\mathfrak{A} \prec (\theta)\,\mathfrak{B}$ if (*) of 3.1 holds with ‘any $\varphi$’ replaced by ‘any subformula
$\varphi$ of $\theta$’.


3.1’ is proved just as was 3.1; one shows that $\mathfrak{A} \prec (\psi)\,\mathfrak{B}$ for all subformulas $\psi$
of $\theta$, by induction on $\psi$. (Clearly 3.1’ slightly improves 3.1 even for the elementary
language.)

Arguing exactly as in the proof of 3.2 (but using 3.1’) one obtains the downward
Löwenheim-Skolem theorem for $L_{\omega_1\omega}$:


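THEOREM 3.2’. Suppose $\mathfrak{B}$ is an $S$-structure, $\kappa$ is an infinite cardinal, $\kappa \le \overline{\overline{B}}$,
and $\Phi$ is a set of $L_{\omega_1\omega}(S)$-formulas of power at most $\kappa$. Then there exists a structure
$\mathfrak{A}$ of power $\kappa$ such that $\mathfrak{A} \prec (\Phi)\,\mathfrak{B}$.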


In particular, for a single formula $\theta$ of $L_{\omega_1\omega}$, there exists a countable $\mathfrak{A} \prec (\theta)\,\mathfrak{B}$.


Examples of the use of 3.2 and 3.2’ will be given a little later. We first return
to the elementary language and prove the ‘upward’ Löwenheim-Skolem theorem
(which is actually due to Tarski). This result, unlike 3.2, will be obtained by means
of the compactness theorem.


THEOREM 3.3 (Upward Löwenheim-Skolem Theorem). Suppose the set $\Sigma$ of
$S$-sentences has an infinite model $\mathfrak{B}$. Then $\Sigma$ has a model in each infinite power
$\kappa \ge \overline{\overline{S}}$.


Of course an equivalent wording (parallel to 3.2), obtained by taking $\Sigma$ to be the
set of all sentences true in $\mathfrak{B}$, says that in each such power there exists $\mathfrak{A} \equiv \mathfrak{B}$.


Proof. Let $d_0, \dots, d_\xi, \dots$ ($\xi < \kappa$) be distinct new constants and let $\Sigma'$ be the set
of all sentences $d_\xi \ne d_\eta$ ($\xi < \eta < \kappa$). Clearly every finite subset of $\Sigma \cup \Sigma'$ has a model,
obtained by adjoining individuals to the given infinite model. (In fact it would
clearly be enough to know that $\Sigma$ has arbitrarily large finite models.) Hence by the
compactness theorem, $\Sigma \cup \Sigma'$ has a model $\mathfrak{A}$. Clearly $\overline{\overline{A}} \ge \kappa$. Since $\overline{\overline{S}} \le \kappa$, we could,
by 2.3, have taken $\mathfrak{A}$ so that also $\overline{\overline{A}} \le \kappa$. (Or, we could take any $\mathfrak{A}$ and apply
the Downward Löwenheim-Skolem theorem to obtain another of power $\kappa$.)

Observe that 3.3 is important when $S$ is finite, and even in this case the proof
makes a detour to a language which has $\kappa$ symbols.

3.2 and 3.3 overlap. For example, they both show that (a) if $S$ is countable and $\mathfrak{B}$
infinite then there is a countable $\mathfrak{A} \equiv \mathfrak{B}$. 3.2, however, gives $\mathfrak{A}$ as a substructure
of $\mathfrak{B}$, which as we have seen is much stronger. (On the other hand, using the ideas
of the proofs of 3.3 and 2.1, one can establish (a) without using the axiom of choice,
which is involved essentially in the proof of 3.2.)

The Löwenheim-Skolem theorems have many applications. In order to discuss
some of them, we first introduce some simple and natural terminology. A set $T$ of
$S$-sentences is called a theory if for any $S$-sentence $\sigma$, if $T \models \sigma$ then $\sigma \in T$. Clearly
a theory $T$ (unlike an arbitrary $\Sigma$) specifies its $S$, called $S(T)$. Any class $K$ of $S$-structures
determines a theory called $\mathrm{Th}(K)$, namely, the set of all $S$-sentences true in
all members of $K$. For example, we can speak of the theory of groups (i.e., $\mathrm{Th}(K)$,
where $K$ is the class of all groups). For single structures, clearly $\mathfrak{A} \equiv \mathfrak{B}$ if and only
if $\mathrm{Th}(\mathfrak{A}) = \mathrm{Th}(\mathfrak{B})$. A set $A$ of $S$-sentences is called a set of axioms for the theory
$T$ if $T$ is just the set of all consequences of $A$. We denote by $T + \Sigma$ the theory $T'$
axiomatized by $T \cup \Sigma$ and such that $S(T')$ is $S(T)$ plus the non-logical symbols occurring
in $\Sigma$. A theory $T$ is called complete (a usage quite different from that in ‘completeness
theorem’) if $T$ has a model and for any $S(T)$-sentence $\sigma$, either $\sigma \in T$ or $\sim\sigma \in T$;
or equivalently, if for some $\mathfrak{A}$, $T = \mathrm{Th}(\mathfrak{A})$.

A beautiful example of the application of 3.2’ and 3.3 is given by the theory $T_0$
of algebraically closed fields of characteristic zero. As is familiar (see, e.g., [16]),
these are fields $\mathfrak{A} = (A, +, \cdot)$ in which (1) $n \cdot 1 \ne 0$ for any positive integer $n$,
and (2) every polynomial (of degree $\ge 1$) has a root. The field of complex numbers
is such a field and so is the field of algebraic (complex) numbers. Though (1) and (2)
sound fairly complicated, it is trivial to see that the class $K_0$ of all algebraically closed
fields of characteristic zero is elementary in the wider sense ($EC_\Delta$). Indeed, a few
well-known elementary sentences say that $\mathfrak{A}$ is a field; (1) can be replaced by the
infinite set of sentences:


(1’) $1 + 1 \ne 0,\ 1 + 1 + 1 \ne 0,\ \dots$;
and (2) can be replaced by all the sentences
(2’) $(\forall u_0, u_1, \dots, u_n)\,[u_0 \ne 0 \to \exists x\,(u_0 x^n + u_1 x^{n-1} + \cdots + u_n = 0)]$
($n = 1, 2, 3, \dots$).


(Here 0 and 1 are elementarily ‘definable’ from $+$ and $\cdot$ and could be eliminated;
while entries like $x^k$, say for $k = 6$, should be replaced by $xxxxxx$.)
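
The two infinite axiom schemes are mechanical to write out, with powers expanded into repeated products as just described. The following sketch generates them as plain strings; the ASCII rendering (`!=`, `forall`, `exists`, `->`) and the function names are our own conventions, not notation from the text.

```python
def char_zero_axiom(n):
    """Sentence of scheme (1') with n >= 2 summands: 1 + 1 + ... + 1 != 0."""
    return " + ".join(["1"] * n) + " != 0"


def power(var, k):
    """Write var^k as a k-fold product, e.g. xxx; the empty product is omitted."""
    return var * k


def root_axiom(n):
    """Sentence of scheme (2') for polynomials of degree n >= 1."""
    coeffs = [f"u{i}" for i in range(n + 1)]
    terms = [c + power("x", n - i) for i, c in enumerate(coeffs)]
    return (f"(forall {', '.join(coeffs)})"
            f"[u0 != 0 -> exists x ({' + '.join(terms)} = 0)]")


print(char_zero_axiom(3))
print(root_axiom(2))
```

Each call produces a single elementary sentence; it is the collection over all $n$ that is infinite, which is why the class is elementary only in the wider sense.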

Recall that in a field $\mathfrak{A}$ an element $x$ is said to be (algebraically) dependent on
the elements $y_0, \dots, y_{m-1}$ if some equation $u_0 x^n + u_1 x^{n-1} + \cdots + u_n = 0$ holds in
which the $u_i$, not all 0, are obtainable as polynomials in $y_0, \dots, y_{m-1}$ with integral
coefficients. A subset $X$ of $A$ is called independent if no element of $X$ is dependent
on finitely many other elements of $X$. A maximal independent subset of $A$ is called
a basis for $\mathfrak{A}$. The following facts (see e.g., [16]), go back to Steinitz: (a) Any two
bases for $\mathfrak{A}$ have the same cardinal number, called the degree of transcendence of $\mathfrak{A}$.
(b) For each cardinal $\kappa$ there is (up to isomorphism) exactly one algebraically closed
field of characteristic zero and transcendence degree $\kappa$.

If the transcendence degree of $\mathfrak{A}$ is $0, 1, \dots, n, \dots$ or $\aleph_0$ then clearly $\mathfrak{A}$ is denumerable
(and all these $\mathfrak{A}$’s are nonisomorphic). On the other hand if the transcendence
degree of $\mathfrak{A}$ is an uncountable cardinal $\kappa$, then already it coincides with $\overline{\overline{A}}$. Hence,
for each uncountable cardinal $\kappa$, there is exactly one model of $T_0$ of power $\kappa$. In
general, this situation is described by saying that $T_0$ is categorical in the power $\kappa$.
Theorem 3.3 has an immediate but remarkable consequence regarding such theories.


COROLLARY 3.4. If a countable theory $T$ is categorical in some infinite power
$\kappa$ and has no finite models, then $T$ is complete.


Indeed, any model $\mathfrak{A}$ of $T$ is elementarily equivalent to a model $\mathfrak{B}$ of power $\kappa$,
and $\mathfrak{B}$ is unique (up to isomorphism).
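
Written out in full, the argument is a short chain of equivalences. The following is a sketch; the names $\mathfrak{A}_1, \mathfrak{A}_2, \mathfrak{B}_1, \mathfrak{B}_2$ are introduced here only for the purpose.

```latex
% Let A_1, A_2 be any two models of T; both are infinite, since T has no finite models.
% By 3.3 applied to Th(A_i), which has the infinite model A_i and countably many symbols,
% there is a model B_i of Th(A_i) of power kappa.  Each B_i is a model of T, so by
% categoricity B_1 and B_2 are isomorphic.
\[
\mathfrak{A}_1 \;\equiv\; \mathfrak{B}_1 \;\cong\; \mathfrak{B}_2 \;\equiv\; \mathfrak{A}_2
\qquad\Longrightarrow\qquad
\mathfrak{A}_1 \equiv \mathfrak{A}_2 .
\]
% Hence every S(T)-sentence sigma is true in all models of T or false in all of them,
% so T |= sigma or T |= ~sigma; as T is a theory with a model, T is complete.
```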
Applying 3.4 to $T_0$, we obtain:


THEOREM 3.5. Any two algebraically closed fields of characteristic zero are 
elementarily equivalent. 


Thus, if one establishes an elementary statement about the complex number 
field by a detour through complex analysis, he can nevertheless be sure that the state- 
ment holds also in the field of algebraic numbers, and indeed, in any algebraically 
closed field of characteristic zero! 


3.5 was first established by another method by Tarski [15]. The above method
(or others) can be pushed further to yield information about what sets and relations
are elementarily definable in a model of $T_0$ (cf. [1], [15]).

Now we consider what can be said about fields $\mathfrak{A} \in K_0$ in the language $L_{\omega_1\omega}$.
It is easy to find $L_{\omega_1\omega}$-sentences $\sigma_0, \dots, \sigma_n, \dots, \sigma_\omega$ such that $\sigma_n$ (or $\sigma_\omega$) says that the
transcendence degree is $n$ (or, is infinite). (By 3.5, no such elementary sentences
can be found.) Verifying this (after rereading the example above of a sentence of
$L_{\omega_1\omega}$ saying ‘is a torsion group’) will give the reader some good experience with
$L_{\omega_1\omega}$.

On the other hand, by 3.2’, if $\mathfrak{B} \in K_0$, $\mathfrak{B}$ is uncountable, and $\varphi$ is any single
sentence of $L_{\omega_1\omega}$ holding in $\mathfrak{B}$, then there is a countable $\mathfrak{A} \in K_0$ of transcendence
degree $\aleph_0$ such that $\mathfrak{A}$ is a model of $\varphi$. (For $\Phi$ in 3.2’ take $\{\varphi, \sigma_\omega, \sigma\}$, where $\sigma$ is
the conjunction of all the elementary axioms characterizing $K_0$.) But there is up to
isomorphism only one such $\mathfrak{A}$, so we conclude:


THEOREM 3.6. Any two algebraically closed fields of characteristic zero having
infinite transcendence degrees are $L_{\omega_1\omega}$-equivalent (i.e., have the same true
$L_{\omega_1\omega}$-sentences).


The fact that nearly anything you could prove about the complex number field 
apparently could always later be proved for all algebraically closed fields of charac- 
teristic zero was observed by algebraists and algebraic geometers (and sometimes 
called ‘‘Lefschetz’ Principle’’). It is interesting that when made precise and proved, 
this conjecture splits into the overlapping statements 3.5 and 3.6. 

We have now completed the mathematical development of this section and are 
ready for the next. But it is impossible not to make several informal remarks about 
some very important topics closely related to the discussion above. 

Firstly: when (as in 3.5) the completeness of a theory $T$ ($= T_0$) is obtained,
usually the so-called decidability of $T$ is also obtained. Roughly speaking, $T$ (or
even an arbitrary set $\Sigma$ of $S$-sentences), is called decidable or recursive if a machine
can be built which can decide, given any $S$-sentence $\sigma$, whether or not $T \models \sigma$ (or $\sigma \in \Sigma$).
(A correct discussion requires the theory of recursive sets and functions.) In general,
if $T$ has a recursive set of axioms, as is usual in practice, then (the hypothesis of) 3.4
does yield the decidability of $T$, but only in a useless, highly theoretical sense.
The proof of 3.5 by Tarski, however, gives a stronger decidability for $T_0$, implying
that a ‘reasonable’ machine might be built. As regards decidability, the best that
can be said for results like 3.4, is that they may, in some cases, alert us to try to find
a reasonable decision procedure.
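
The ‘highly theoretical’ procedure can be sketched as follows. This is the standard argument rather than something spelled out in the text: if the consequences of a complete theory can be listed one after another (as the Completeness Theorem permits when the axioms are recursive), then one simply waits until $\sigma$ or its negation appears. The enumerator `toy_theorems` below is a hypothetical stand-in for a real proof enumerator, and the string syntax for sentences is likewise an assumption made for the example.

```python
from itertools import count


def negate(sentence):
    """Toggle a leading '~' (toy syntax for negation)."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def decide(sentence, enumerate_theorems):
    """Search the enumeration until `sentence` or its negation turns up.

    The search is guaranteed to stop only if the theory is complete.
    """
    refutation = negate(sentence)
    for theorem in enumerate_theorems():
        if theorem == sentence:
            return True
        if theorem == refutation:
            return False


def toy_theorems():
    """Hypothetical enumerator: lists even(n) or ~even(n), whichever holds, for n = 0, 1, 2, ..."""
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"~even({n})"


print(decide("even(4)", toy_theorems))
print(decide("even(7)", toy_theorems))
print(decide("~even(7)", toy_theorems))
```

Nothing in the procedure bounds how long the search runs, which is the sense in which such a decision method is useless in practice.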

Secondly: Although 3.3 is only concerned with the existence of structures elementarily
equivalent to a given $\mathfrak{A}$, we used it to show the elementary equivalence
of given structures $\mathfrak{A}$ and $\mathfrak{B}$, for example, the field of complex numbers and the field
of all algebraic complex numbers. But theories categorical in power are rare,
and in general, other methods must be used to prove elementary equivalence (or
$L_{\omega_1\omega}$-equivalence) of given structures. Examples of such methods are the method
of eliminating quantifiers (cf. [15]), the method of model-completeness (cf. [12]),
and the Fraïssé-Ehrenfeucht method (cf. [3]). Among the four or five most famous
results (on completeness and decidability) are Tarski’s [15] concerning the field 
of real numbers and that of Ax and Kochen [1] concerning the p-adic number fields. 
There is a great deal still to be done. For example, it is an open problem (raised 
long ago by Tarski) whether the free group with two generators and the free group 
with three generators are elementarily equivalent! 

Finally, let us look more closely at theories $T$ categorical in some infinite power.
The theory $T_0$ of algebraically closed fields of characteristic zero is categorical in
each power $\kappa > \aleph_0$, but not in the power $\aleph_0$. The theory of Abelian groups in which
every element is of order $p$ ($p$ a fixed prime) is easily seen to be categorical in every
infinite power (again by a ‘basis’ argument, now a vector space basis with respect
to the field $\{0, \dots, p-1\}$). The theory of densely ordered sets $(A, <)$ with no extreme
elements is, as is well known (cf. §4), categorical in the power $\aleph_0$ but no other.
It was conjectured by Łoś that these three behaviours are the only possibilities,
and Morley gave several years later a beautiful proof of this conjecture:


THEOREM 3.7 [9]. If a countable theory T is categorical in one uncountable 
power, then it is categorical in every uncountable power. 


We do not prove 3.7, and indeed present it partly just as an example of a simple, 
intuitively motivated statement of model theory which requires a deep proof. The 
reader may find the original proof in [9] and an interesting recent proof in a paper 
by Baldwin and Lachlan [18]. (To read these, nothing beyond what is in this 
article is required.) 

There are a number of further conjectures regarding theories categorical in uncountable
powers which the evidence suggests (in particular, think of $T_0$). With
a good deal of further effort and ingenuity, several of them have been established.
Thus, Morley [10] showed that for such a $T$ the number of nonisomorphic denumerable
models must be countable. A general result (see 6.7) tells us that (for any
complete theory) the number cannot be 2, but Morley’s work left open whether
it might be $3, 4, 5, \dots$ (as well as 1 and $\aleph_0$ as in examples above). Then Baldwin
and Lachlan (in the aforementioned article) showed that it must be 1 or $\aleph_0$. Another
problem on which several authors made contributions was that of extending Morley’s
result 3.7 to arbitrary $T$ (‘‘uncountable power’’ being replaced by ‘‘power $> \overline{\overline{T}}$’’).
This has only very recently been established in full generality, by S. Shelah (in a 
paper to appear in the Proceedings of the Tarski Symposium held in Berkeley, 
1971). 

Perhaps the most interesting questions still unresolved are those concerning
the finite axiomatizability of $T$: (a) if $T$ is categorical in $\aleph_1$ and complete (i.e., by
3.4, has no finite models), can $T$ be finitely axiomatizable? (b) If $T$ (possibly incomplete)
is categorical in $\aleph_1$ but not in $\aleph_0$, can $T$ be finitely axiomatizable? (These
two problems were formulated by J. Ax as a refinement of one problem in [9], p. 537.)


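4. Saturated models. Before beginning on the main topic of this section, we must
call attention to a second basic fact about the notion $\prec$, namely 4.1 below, which
is as pervasive in model theory as the downward Löwenheim-Skolem theorem
(3.2). Let $S$ be a set of relation symbols. Suppose $\mathcal{K}$ is a set of $S$-structures directed
by $\prec$, i.e., such that if $\mathfrak{A}, \mathfrak{B} \in \mathcal{K}$, then for some $\mathfrak{C} \in \mathcal{K}$, $\mathfrak{A} \prec \mathfrak{C}$ and $\mathfrak{B} \prec \mathfrak{C}$.
Then (as in algebra) we define $\bigcup(\mathfrak{A} : \mathfrak{A} \in \mathcal{K})$ to be the $S$-structure $\mathfrak{B}$ such that
$|\mathfrak{B}| = \bigcup(|\mathfrak{A}| : \mathfrak{A} \in \mathcal{K})$ and for each $P \in S$, $P^{\mathfrak{B}} = \bigcup(P^{\mathfrak{A}} : \mathfrak{A} \in \mathcal{K})$.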


THEOREM 4.1. (Tarski) If $\mathfrak{B}$ is the union of a family $\mathcal{K}$ of $S$-structures directed
by $\prec$, then for each $\mathfrak{A} \in \mathcal{K}$, $\mathfrak{A} \prec \mathfrak{B}$.


4.1 is easily proved by induction on the length of the formula $\theta$ in the definition
of ‘$\mathfrak{A} \prec \mathfrak{B}$’. (Like 3.2, it can be extended in suitable forms to $L_{\omega_1\omega}$, but we shall
not need this.)

Cantor established a famous theorem about the ordering (Q, <) of the rational 
numbers: 

(1) Any two denumerable dense orderings (A, <) and (A’, <’) without end 
points are isomorphic; 

(2) Any countable ordering is isomorphic to a substructure of (Q, <); 

(3) If $q_0 < q_1 < \cdots < q_n$ and $q'_0 < q'_1 < \cdots < q'_n$, then there is an
automorphism of (Q, <) taking $q_i$ into $q'_i$ ($i \le n$).

Since this whole section will be a generalization of Cantor’s theorem, we briefly 
recall the proof. 

For (1), we first write $A = \{a_0, \ldots, a_n, \ldots\}$ and $A' = \{a'_0, \ldots, a'_n, \ldots\}$. The
desired isomorphism f will be $\{(x_n, x'_n) : n \in \omega\}$, where the $(x_n, x'_n)$ are defined by
recursion on n in such a way that (*) $x_j < x_k$ if and only if $x'_j <' x'_k$. Suppose we already
have $(x_i, x'_i)$, for $i < n$, such that (*) holds (for $j, k < n$). If n is even, let $x_n$ be the
first unused $a_i$ (in the list $a_0, a_1, \ldots$). The hypotheses insure that there is an $a'_j$
having exactly the same order-relations with $x'_0, \ldots, x'_{n-1}$ that $x_n$ has with $x_0, \ldots, x_{n-1}$;
we take such an $a'_j$ for $x'_n$. If n is odd, proceed symmetrically; i.e., let $x'_n$ be the first
unused $a'_j$ and choose $x_n$ to match its order relations. The f so obtained is clearly
an isomorphism of $\mathfrak{A}$ onto $\mathfrak{A}'$. The proof of (3) is similar, and that of (2) simpler,
as the 'back and forth' procedure (distinguishing even and odd n) is omitted.
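
The recursion in the proof of (1) can be run for finitely many steps on a computer. The following Python sketch is only an illustration under stated assumptions: the two dense orderings are taken to be the rationals and the dyadic rationals, and the names `rationals`, `dyadics`, and `back_and_forth` are hypothetical helpers introduced here, not part of the text. It returns the first pairs $(x_n, x'_n)$ of the isomorphism.

```python
from fractions import Fraction
from itertools import count


def rationals():
    """Enumerate every rational exactly once: 0, then +/- p/q by increasing p + q."""
    yield Fraction(0)
    seen = {Fraction(0)}
    for total in count(2):
        for p in range(1, total):
            q = Fraction(p, total - p)
            if q not in seen:
                seen.add(q)
                yield q
                yield -q


def dyadics():
    """Enumerate every dyadic rational k / 2**m exactly once."""
    yield Fraction(0)
    seen = {Fraction(0)}
    for m in count(0):
        for k in range(1, (m + 1) * 2 ** m + 1):
            d = Fraction(k, 2 ** m)
            if d not in seen:
                seen.add(d)
                yield d
                yield -d


def back_and_forth(enum_a, enum_b, steps):
    """Return `steps` pairs (x_n, x'_n) forming a finite order-preserving map."""
    # For each side: its enumeration, the prefix listed so far, the elements already used.
    sides = [(enum_a(), [], set()), (enum_b(), [], set())]
    pairs = []

    def element(side, i):
        gen, listed, _ = sides[side]
        while len(listed) <= i:
            listed.append(next(gen))
        return listed[i]

    def first_unused(side):
        used = sides[side][2]
        return next(element(side, i) for i in count() if element(side, i) not in used)

    def matching(x, side):
        """First unused element of `side` ordered against earlier picks as x is on the other side."""
        other = 1 - side
        used = sides[side][2]
        for i in count():
            y = element(side, i)
            if y not in used and all((p[other] < x) == (p[side] < y) for p in pairs):
                return y

    for n in range(steps):
        src = n % 2                       # even n: "forth" from A; odd n: "back" from A'
        x = first_unused(src)
        y = matching(x, 1 - src)
        pair = (x, y) if src == 0 else (y, x)
        pairs.append(pair)
        sides[0][2].add(pair[0])
        sides[1][2].add(pair[1])
    return pairs
```

Calling `back_and_forth(rationals, dyadics, 20)` gives a 20-element order-preserving partial map. The search in `matching` terminates only because both orderings are dense without end points, which is exactly the hypothesis used in the proof.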

After seeing the method of Cantor's proof, anyone is likely to speculate that the
method can be used in other places. For instance, one could ask for a dense ordering
in other powers > $\aleph_0$ with properties similar to (1), (2), and (3) (cf. the $\eta_\alpha$-orders
of Hausdorff). Generalizing more broadly, we could raise similar questions with
any other class K of similar structures such as the class of all groups in place of the
class of orderings. The strongest results of this type were obtained by Jónsson [4].
Of course, they require special hypotheses about the class K. However, it turns
out that if algebraic notions like substructure (in (2) and in Jónsson's work) are
replaced by model-theoretic notions like elementary substructure, then some useful
and quite uniform results of general model theory are obtained. We now establish
these results (due to Morley and the author [11]). (We give a direct argument, but
the results can also be inferred from Jónsson's broader, algebraic results by a suitable
interpretation.)

Roughly speaking, the role of the class of orderings is assumed in our work by 
the class of all models of an arbitrary complete theory T having infinite models. 
However, since such a T is determined by any model $\mathfrak{A}$ of T, there is sometimes
no explicit mention of a theory T.

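If $Y \subseteq B$, $f : Y \to A$, and $(\mathfrak{B}, y)_{y \in Y} \equiv (\mathfrak{A}, f(y))_{y \in Y}$, then we say that f embeds
Y (over $\mathfrak{B}$) elementarily in $\mathfrak{A}$. (In context, 'over $\mathfrak{B}$' is usually omitted.)

The following usage is nearly self-explanatory: A set $\Psi$ of 1-formulas (of $L(S(\mathfrak{C}))$)
is satisfiable in $\mathfrak{C}$ if for some $x \in C$, $\mathfrak{C} \models \psi[x]$ for all $\psi \in \Psi$. $\Psi$ is finitely satisfiable
in $\mathfrak{C}$ if each finite subset of $\Psi$ is satisfiable in $\mathfrak{C}$.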


DEFINITION 4.2. Let $\mathfrak{A}$ be a structure of infinite power $\kappa$. We say that:

(a) $\mathfrak{A}$ is homogeneous if any elementary embedding of a subset of A of power
$< \kappa$ into $\mathfrak{A}$ can be extended to an automorphism of $\mathfrak{A}$.

(b) $\mathfrak{A}$ is universal (or $\mu$-universal) if, for any $\mathfrak{B} \equiv \mathfrak{A}$ and any subset Y of B
of power $\le \kappa$ (or, of power $< \mu$), Y can be elementarily embedded in $\mathfrak{A}$.

(c) $\mathfrak{A}$ is saturated if whenever X is a subset of A of power $< \kappa$, and $\Phi$ is any
set of 1-formulas which is finitely satisfiable in $(\mathfrak{A}, x)_{x \in X}$, then $\Phi$ is satisfiable
in $(\mathfrak{A}, x)_{x \in X}$.


(c) suggests that saturated models might well be called ‘compact’ models and, 
in fact, this name has been used for this notion, but also for some related but not 
equivalent notions. 

(c) can be stated in very slightly different, equivalent form involving the useful
notion of a type of element. Given a complete S-theory T, choose a new constant c
(not in S(T)). If $\mathfrak{A}$ is a model of T and $a \in A$, then Th$((\mathfrak{A}, a))$ is a complete $S \cup \{c\}$-theory,
called the type of a (in $\mathfrak{A}$). The set of all such Th$((\mathfrak{A}, a))$, over all models
$\mathfrak{A}$ of T and elements a of A, is called the set of all types of elements over T. In other
words, these types are just the complete $S \cup \{c\}$-theories extending T. If $\mathfrak{A}$ is a model
of T and T' a type of element over T, then in general there may or may not be an
element a in $\mathfrak{A}$ of type T'; if there is, we say that T' is realized in $\mathfrak{A}$. For example,
let $\mathfrak{B}$ be an ordered field elementarily equivalent to the reals $\mathfrak{A}_0$, but possessing an
element b such that for each n, $n \cdot 1 < b$. Then Th$((\mathfrak{B}, b))$ is a type over $\mathfrak{A}_0$ (i.e.,
over Th$(\mathfrak{A}_0)$), but is not realized in $\mathfrak{A}_0$. However, a saturated model realizes all
possible types (hence the name) and this is so even after not too many 'parameters'
are added. Speaking precisely:

4.2(c') $\mathfrak{A}$ is saturated if and only if, for any subset X of A of power $< \kappa$,
$(\mathfrak{A}, x)_{x \in X}$ realizes all possible types of elements (over itself).




4.2(c’) is trivial to verify, using the Compactness Theorem. 

Another simple but useful fact about saturated models, immediate using (c), is:
(4) If $\mathfrak{A}$ is saturated and $S' \subseteq S(\mathfrak{A})$ then $\mathfrak{A} \restriction S'$ is saturated.

The next theorem, 4.3, is similar to Cantor’s theorems (1), (2), and (3). 


THEOREM 4.3. Let $|A| = \kappa \ge \aleph_0$.

(a) If $\mathfrak{A}$ and $\mathfrak{A}'$ are both saturated and of power $\kappa$ and $\mathfrak{A} \equiv \mathfrak{A}'$ then $\mathfrak{A} \cong \mathfrak{A}'$.

(b) $\mathfrak{A}$ is homogeneous if (and only if) whenever X is a subset of A of power
$< \kappa$, f is an elementary embedding of X into $\mathfrak{A}$, and $a \in A$, then f can be extended
to an elementary embedding of $X \cup \{a\}$ into $\mathfrak{A}$. (This condition might be called
"innerly saturated.")

(c) If $\mathfrak{A}$ is saturated then $\mathfrak{A}$ is homogeneous and universal.

(d) If $\mathfrak{A}$ is homogeneous and $\omega$-universal then $\mathfrak{A}$ is saturated.


Proof. (a) We imitate exactly Cantor's proof above of (1). Let $A = \{a_\xi : \xi < \kappa\}$
and $A' = \{a'_\xi : \xi < \kappa\}$. The desired isomorphism f will be $\{(x_\xi, x'_\xi) : \xi < \kappa\}$, where
the pairs $(x_\xi, x'_\xi)$ are defined by recursion on $\xi$ in such a way that (*) $(\mathfrak{A}, x_\eta)_{\eta \le \xi}
\equiv (\mathfrak{A}', x'_\eta)_{\eta \le \xi}$. Suppose we are to define $(x_\zeta, x'_\zeta)$. Since (*) holds for each $\xi < \zeta$,
it is clear that even when $\zeta$ is a limit ordinal, $(\mathfrak{A}, x_\eta)_{\eta < \zeta} \equiv (\mathfrak{A}', x'_\eta)_{\eta < \zeta}$. Suppose,
for example, $\zeta$ is even. Let $x_\zeta$ be the first unused $a_\xi$. 4.2(c') tells us directly that there
exists $z \in A'$ such that $(\mathfrak{A}, x_\eta, x_\zeta)_{\eta < \zeta} \equiv (\mathfrak{A}', x'_\eta, z)_{\eta < \zeta}$; so we can take such a z for $x'_\zeta$.

(b) is proved in a completely similar way. As regards (c), if $\mathfrak{A}$ is saturated, $\mathfrak{A}$ is
certainly homogeneous as the right-hand condition in (b) is immediate. To prove $\mathfrak{A}$
is universal (cf. Cantor's (2)), one proceeds just as in the proof of (a), but with no
'back and forth'.

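The argument for (d) is amusing. Call $\mathfrak{A}$ $\lambda$-saturated if $(\mathfrak{A}, x)_{x \in X}$ realizes all types
of elements whenever $|X| < \lambda$. The Cantor-type argument for (c) really shows
that (i) if $\mathfrak{A}$ is $\lambda$-saturated, then $\mathfrak{A}$ is $\lambda^+$-universal. On the other hand, it is trivial
that (ii) if $\mathfrak{A}$ is homogeneous and $\lambda$-universal then $\mathfrak{A}$ is $\lambda$-saturated, provided $\lambda$ is
infinite and $\lambda \le \kappa$. (Indeed, suppose $\mu < \lambda$, and T' is a type over $(\mathfrak{A}, x_\xi)_{\xi < \mu}$, say
T' = Th$(\mathfrak{B}, y_\xi, b)_{\xi < \mu}$. By $\lambda$-universality, we obtain $(\mathfrak{A}, z_\xi, w)_{\xi < \mu} \equiv (\mathfrak{B}, y_\xi, b)_{\xi < \mu}$.
By homogeneity, there exists $a \in A$ such that $(\mathfrak{A}, x_\xi, a)_{\xi < \mu} \equiv (\mathfrak{A}, z_\xi, w)_{\xi < \mu}$. Clearly
the type of a in $(\mathfrak{A}, x_\xi)_{\xi < \mu}$ is T'.)

By (i) and (ii), if $\aleph_0 \le \lambda \le \kappa$ and $\mathfrak{A}$ is homogeneous, then if $\mathfrak{A}$ is $\lambda$-universal,
$\mathfrak{A}$ is $\lambda^+$-universal. Clearly for a limit cardinal $\lambda$, $\mathfrak{A}$ is $\lambda$-universal if it is $\mu$-universal
for all $\mu < \lambda$. Hence (by induction on the cardinal $\lambda$), if $\mathfrak{A}$ is homogeneous and
$\omega$-universal, then $\mathfrak{A}$ is $\kappa^+$-universal. It follows trivially (by (ii)) that such an $\mathfrak{A}$ is
saturated.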

Theorem 4.3 is all very nice but it does not say that there are any saturated
structures. The next theorem says that they exist very generally provided we assume the
generalized continuum hypothesis (G.C.H.): $2^{\aleph_\alpha} = \aleph_{\alpha+1}$ for all $\alpha$.


THEOREM 4.4. (G.C.H.) Given any complete S-theory T, having infinite models,
and any uncountable, regular power $\kappa > |S|$, there exists a saturated model
of T of power $\kappa$ (and up to isomorphism just one, by 4.3).


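Proof. We build up the desired $\mathfrak{A}$ by applying the compactness theorem many
times. By 3.3, there is a model $\mathfrak{A}_0$ of T of power $\kappa$. I claim that there exists $\mathfrak{A}_1$
such that:

(5) $\mathfrak{A}_1 \succ \mathfrak{A}_0$, $|A_1| = \kappa$, and, for any subset X of $A_0$ of power $< \kappa$, $(\mathfrak{A}_1, x)_{x \in X}$
realizes all possible types of elements.

Assume for a minute this claim. Then we can also obtain $\mathfrak{A}_2$, which is to $\mathfrak{A}_1$ just as
$\mathfrak{A}_1$ was to $\mathfrak{A}_0$. Continuing, we obtain $\mathfrak{A}_0, \mathfrak{A}_1, \ldots, \mathfrak{A}_\xi, \ldots$ ($\xi < \kappa$), where, for a limit
ordinal $\xi$, $\mathfrak{A}_\xi = \bigcup(\mathfrak{A}_\eta : \eta < \xi)$. The desired $\mathfrak{A}$ is $\bigcup(\mathfrak{A}_\xi : \xi < \kappa)$. Indeed, clearly
$|A| = \kappa$ and by 4.1, each $\mathfrak{A}_\xi \prec \mathfrak{A}$. If $X \subseteq A$ and $|X| < \kappa$ then, because $\kappa$ is assumed
regular, X must be a subset of some $A_\xi$. But any type of element over Th$(\mathfrak{A}, x)_{x \in X}$
= Th$(\mathfrak{A}_\xi, x)_{x \in X}$ is realized in $\mathfrak{A}_{\xi+1}$, by construction (and hence in $\mathfrak{A}$, since $\mathfrak{A}_{\xi+1} \prec \mathfrak{A}$).
Thus $\mathfrak{A}$ is saturated (by 4.2(c')).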

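Now we must create a structure $\mathfrak{A}_1$ for which (5) holds. Consider all pairs (X, T')
such that X is a subset of $A_0$ of power $< \kappa$ and T' is a type of element over $(\mathfrak{A}_0, x)_{x \in X}$.
We are assuming $\kappa$ is regular, $|S| + \aleph_0 < \kappa$, and G.C.H. Using these, it is routine
to compute that there are at most $\kappa$ such X's and, indeed, at most $\kappa$ such pairs
(X, T'). Suppose we can show that for any one such pair (X, T') there always
exists $\mathfrak{A}^{(1)}$ such that

(6) $\mathfrak{A}_0 \prec \mathfrak{A}^{(1)}$ and T' is realized in $(\mathfrak{A}^{(1)}, x)_{x \in X}$.

Clearly (by 3.2), we can also insist that $|A^{(1)}| \le \kappa$. Then we can form
$\mathfrak{A}^{(0)}, \mathfrak{A}^{(1)}, \ldots, \mathfrak{A}^{(\xi)}, \ldots$ ($\xi < \kappa$), one dealing with each pair (X, T'), and take
their union for the desired $\mathfrak{A}_1$.

Finally, given one pair (X, T'), to obtain (6), put $\mathfrak{A}^* = (\mathfrak{A}_0, a)_{a \in A_0}$ and
$\Sigma = \mathrm{Th}\,\mathfrak{A}^* \cup T'$. Obviously any model of $\Sigma$ has an isomorph $\mathfrak{A}^{(1)}$ for which (6)
holds. To show that $\Sigma$ has a model, let $\varphi_1(c), \ldots, \varphi_n(c)$ (c not in $\varphi_i$) belong to T',
and write $\varphi = \varphi_1 \wedge \varphi_2 \wedge \cdots \wedge \varphi_n$. Then clearly, $\exists v_0 \varphi(v_0)$ belongs to Th $\mathfrak{A}^*$. Hence, for
some d, $\mathfrak{A}^* \models \varphi[d]$ and $(\mathfrak{A}^*, d)$ is a model of Th $\mathfrak{A}^* \cup \{\varphi_1(c), \ldots, \varphi_n(c)\}$. Hence, by
compactness, $\Sigma$ has a model, and the proof is complete.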

Incidentally, the little argument just made involving the existential quantifier 
is used many times in model theory. 

We stated 4.4 for arbitrary S(T), but our primary concern is when S(T) is
countable, as we now assume. 4.4 still does not give a saturated model of T of power $\aleph_0$,
and indeed there may not be one. For an example, consider the theory $T_0$ of the
field $\mathfrak{A}_0$ = (Reals, +, ·). In $\mathfrak{A}_0$, the usual ordering $\le$ is elementarily definable
($x \le y \leftrightarrow \exists z(z^2 = y - x)$). Clearly, for each rational q, we can introduce a formula
$\varphi_q(v_0)$ saying that '$q < v_0$'. For each real x, let $Q_x = \{q \in Q : \mathfrak{A}_0 \models \varphi_q[x]\}$. Then
clearly $Q_x$ determines x. Hence, if $x_1 \ne x_2$, then Th$(\mathfrak{A}_0, x_1) \ne$ Th$(\mathfrak{A}_0, x_2)$. Thus
$T_0$ has $2^{\aleph_0}$ types of elements (even without using parameters). Any countable
model of $T_0$ can realize only countably many types, so there is no denumerable
saturated model of $T_0$. (Using the same arguments, one can show there are
$2^{\aleph_0}$ non-isomorphic denumerable models of $T_0$.)

In order to specify just when T has a denumerable saturated model, we need
an obvious generalization of the notion of type of element over T. If $\mathfrak{A}$ is a model
of T, and $a_0, \ldots, a_{n-1} \in A$ (and $c_0, \ldots, c_{n-1}$ are new constants), then Th$(\mathfrak{A}, a_0, \ldots, a_{n-1})$
is called a type of n-tuple of elements, or simply n-type, over T.


THEOREM 4.5. A countable complete theory T, having infinite models, has a
denumerable saturated model if and only if for each n, T has countably many
n-types.


Proof. The necessity is just as in the example of the real field above. For
sufficiency, one simply repeats the proof of 4.4. At the point where the number of pairs
(X, T') must be shown to be small enough, the old hypothesis $|S| + \aleph_0 < \kappa$ is replaced
by our new assumption about the number of n-types.

The field of complex numbers is saturated and so is the denumerable algebraically
closed field of characteristic zero and transcendence degree $\aleph_0$. More generally,
if T is categorical in a non-denumerable power then every non-denumerable model
of T can be shown without using the G.C.H. to be saturated (this is part of the proof
of Morley's Theorem 3.4); and also, T has a denumerable saturated model. The
complete theory $T_0$ of the real field was shown above to have no denumerable
saturated model. Of course, using G.C.H., $T_0$ has a saturated model in every regular
uncountable power $\aleph_\alpha$. What is special here is that these saturated models of $T_0$
can be shown to coincide with the fields already studied (by Erdős, Gillman, and
Henriksen) under the name '$\eta_\alpha$-real closed fields'. If a theory T' is categorical
in the power $\aleph_0$, then the condition in 4.5 obviously holds, so the unique denumerable
model of T' is saturated. (An example is the theory of densely ordered sets without
extreme points!)

Some of the remarks just made require definitely nontrivial proofs. By 4.4 and 
4.5 there are a great many saturated models. However, to prove that a given model
$\mathfrak{A}$ is saturated is usually not easy, because (in view of the definition of 'saturated')
it is likely to require much information about what an elementary formula can say
about an element of $\mathfrak{A}$, or a pair of elements, etc.


5. Ultraproducts. As is familiar, a family D of subsets of a given set I is called
an ultrafilter over I if (i) $\emptyset \notin D$, (ii) if $X \in D$ and $X \subseteq Y$ then $Y \in D$, (iii) if $X, Y \in D$
then $X \cap Y \in D$, and (iv) if $X \subseteq I$ then $X \in D$ or $I - X \in D$. It is easy to show,
using Zorn's lemma, that


(1) if F is a family of subsets of I having the finite intersection
property (no finite intersection of members of F is empty),
then for some ultrafilter D, $F \subseteq D$.


For each $i \in I$, $\{X \subseteq I : i \in X\}$ is clearly an ultrafilter; ultrafilters of this special form
are called principal. Obviously, D is non-principal if and only if D contains $I - X$
for each finite set X. If I is infinite the set of all such 'cofinite' sets obviously has the
finite intersection property, so, by (1), there is at least one non-principal ultrafilter
over I.
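
Conditions (i)-(iv) and the finite intersection property can be checked mechanically when I is finite. The sketch below is only an illustration: the function names are hypothetical, and since every ultrafilter over a finite set is principal, the non-principal ultrafilters obtained from (1) cannot be exhibited this way.

```python
from itertools import combinations


def powerset(I):
    """All subsets of the finite set I, as frozensets."""
    items = sorted(I)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_ultrafilter(D, I):
    """Check conditions (i)-(iv) for a family D of subsets of the finite set I."""
    I = frozenset(I)
    D = set(D)
    subsets = powerset(I)
    no_empty = frozenset() not in D                                  # (i)
    upward = all(Y in D for X in D for Y in subsets if X <= Y)       # (ii)
    meets = all((X & Y) in D for X in D for Y in D)                  # (iii)
    decides = all(X in D or (I - X) in D for X in subsets)           # (iv)
    return no_empty and upward and meets and decides


def principal(i, I):
    """The principal ultrafilter {X subset of I : i in X}."""
    return {X for X in powerset(I) if i in X}


def has_fip(F):
    """Finite intersection property: no finite intersection of members of F is empty."""
    F = list(F)
    return all(frozenset.intersection(*c)
               for r in range(1, len(F) + 1) for c in combinations(F, r))
```

For a three-element I, a brute-force search over all 256 families of subsets finds exactly the three principal ultrafilters.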

An ultrafilter D can also (and suggestively) be viewed as a two-valued finitely
additive measure $\mu$. (For each $X \subseteq I$, simply put $\mu X = 1$ or 0 according as $X \in D$
or not.)

In this spirit we say that a property P(i) holds for D-almost all i to mean that
$\{i \in I : P(i)\} \in D$.

Now suppose that to each $i \in I$ is correlated an S-structure $\mathfrak{A}_i$, and that D is
an ultrafilter over I. Let C be the Cartesian product $\prod(A_i : i \in I)$ of the sets $A_i$. If
$f, g \in C$, we write $f \sim_D g$ if $f(i) = g(i)$ for D-almost all i. $\sim_D$ is easily seen to be an
equivalence relation on C. Let $f/D$ denote the equivalence class of f and put
$B = \{f/D : f \in C\}$. We now form an S-structure $\mathfrak{B}$ having universe B and called
the ultraproduct $\prod_D(\mathfrak{A}_i : i \in I)$. If $P \in S$ and P is n-ary, then (*) $P^{\mathfrak{B}} f_0/D \cdots f_{n-1}/D$ is to
hold if and only if $P^{\mathfrak{A}_i} f_0(i) \cdots f_{n-1}(i)$ holds for D-almost all i.
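
The construction can be carried out literally when the index set and the structures are finite. The following sketch makes simplifying assumptions that are not in the text: S consists of a single binary relation symbol P, the index set is $\{0, \ldots, n-1\}$, and D is supplied explicitly as a set of frozensets (necessarily a principal ultrafilter in the finite case). The names `principal_ultrafilter` and `ultraproduct` are hypothetical.

```python
from itertools import combinations, product


def principal_ultrafilter(j, n):
    """All subsets of {0, ..., n-1} that contain j."""
    return {frozenset(c) for r in range(n + 1)
            for c in combinations(range(n), r) if j in c}


def ultraproduct(universes, relations, D):
    """Ultraproduct of finitely many finite structures with one binary relation P.

    universes[i] is the universe A_i, relations[i] the set of pairs in P^{A_i},
    and D an ultrafilter over the index set {0, ..., n-1}.
    Returns the list of classes f/D and the relation P^B on class indices.
    """
    n = len(universes)

    def almost_all(pred):
        # "pred(i) holds for D-almost all i"
        return frozenset(i for i in range(n) if pred(i)) in D

    classes = []                                    # each class is a list of functions f
    for f in product(*universes):                   # f ranges over the Cartesian product C
        for cls in classes:
            g = cls[0]
            if almost_all(lambda i: f[i] == g[i]):  # f ~_D g
                cls.append(f)
                break
        else:
            classes.append([f])

    rel = set()
    for a, cls_a in enumerate(classes):
        for b, cls_b in enumerate(classes):
            f, g = cls_a[0], cls_b[0]               # (*) does not depend on representatives
            if almost_all(lambda i: (f[i], g[i]) in relations[i]):
                rel.add((a, b))
    return classes, rel
```

With D principal at the index j, the classes correspond to the elements of $A_j$ and the relation returned is a copy of $P^{\mathfrak{A}_j}$, in line with the remark on principal ultrafilters made later in this section.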

Observe that the ultraproduct operation has been defined purely algebraically — 
with no mention of logic. The central and remarkable fact about ultraproducts is 
that, nevertheless, there is an immediate and strong connection between ultraproducts 
and logic. In fact we can prove that the relationship (*) extends from atomic formulas 
to all formulas of L(S): 


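THEOREM 5.1. Basic ultraproduct theorem (Łoś). Let D, $\mathfrak{A}_i$, $\mathfrak{B}$ and C be as
above. Then for any S-formula $\varphi$ with distinct free variables $u_0, \ldots, u_{n-1}$ and any
$f_0, \ldots, f_{n-1} \in C$,

$$\mathfrak{B} \models \varphi\begin{bmatrix} u_0 & \cdots & u_{n-1} \\ f_0/D & \cdots & f_{n-1}/D \end{bmatrix} \quad \text{if and only if} \quad \mathfrak{A}_i \models \varphi\begin{bmatrix} u_0 & \cdots & u_{n-1} \\ f_0(i) & \cdots & f_{n-1}(i) \end{bmatrix}$$

for D-almost all i.


It is just an exercise to establish 5.1 by 'induction on the formula $\varphi$.' Notice
that no theorem of logic (beyond the basic definitions in §1) is required in the
proof.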

The definition of ultraproduct and Theorem 5.1 have been given assuming S
contains only relation symbols. However, by 5.1, if P is, e.g., ternary, and each
$P^{\mathfrak{A}_i}$ is 'operational' (i.e., $\mathfrak{A}_i \models (\forall u, v)(\exists! w)Puvw$) then $P^{\mathfrak{B}}$ is also operational. Hence
the definition of ultraproduct can be modified in an obvious way to allow operation
symbols. Thus, as usual, there is no loss of generality in assuming S has only relation
symbols.

Using 5.1, we shall now give a second proof (due to D. Scott and A. Tarski)
of the compactness theorem 2.1. Suppose that every finite subset s of a given set $\Sigma$
of S-sentences has a model, say, $\mathfrak{A}_s$. Let I be the set of all finite subsets of $\Sigma$. For
each $\sigma \in \Sigma$, let $W_\sigma = \{s \in I : \sigma \in s\}$. Obviously, the set of all such $W_\sigma$ has the finite
intersection property, so by (1) it can be extended to an ultrafilter D over I. Then
$\mathfrak{B} = \prod_D(\mathfrak{A}_s : s \in I)$ is the desired model of $\Sigma$. Indeed, if $\sigma \in \Sigma$, then each $\mathfrak{A}_s$ such
that $\sigma \in s$ was taken to be a model of $\sigma$. Therefore $\mathfrak{A}_s \models \sigma$ for all $s \in W_\sigma$ and so for
D-almost all s. Hence, by 5.1, $\mathfrak{B} \models \sigma$.
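
The one combinatorial step in this proof, that the sets $W_\sigma$ have the finite intersection property, can be made concrete. The sketch below is an illustration only: sentences are represented by arbitrary string labels, $\Sigma$ is taken finite so that I can be listed, and the function names are hypothetical. The point it shows is that $\{\sigma_1, \ldots, \sigma_k\}$ itself belongs to $W_{\sigma_1} \cap \cdots \cap W_{\sigma_k}$.

```python
from itertools import combinations


def finite_subsets(sigma):
    """All finite subsets of the (here finite) set sigma of sentence labels."""
    items = sorted(sigma)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def W(sentence, I):
    """W_sigma = {s in I : sentence in s}."""
    return frozenset(s for s in I if sentence in s)


def witness(sentences):
    """A member of the intersection of the W_sigma for the given sentences."""
    return frozenset(sentences)
```

In the proof proper $\Sigma$ is infinite, and it is only then that the ultrafilter extending the family of all $W_\sigma$ is forced to be non-principal.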

Incidentally, the inquisitive reader who tries to compare this beautiful proof
of 2.1 with that given in §2 will find that the two proofs have more in common
than first appearances might indicate. For example, looking at 5.1, suppose we
expand S to S' by introducing a new constant $c_f$ for each $f \in C$. Let $\Gamma$ be the set of all
S'-sentences true in all the structures $(\mathfrak{A}_i, f(i))_{f \in C}$, for $i \in I$. Then clearly $\Gamma$ has
the property that was possessed by the set $\Delta_S$ in the original proof of 2.1, namely:
For any 1-formula $\varphi$ over S', there exists $d \in S'$ such that $\exists v_0 \varphi \to \varphi(d) \in \Gamma$.

There are a number of basic facts about ultraproducts whose proofs (sometimes
using 5.1) will easily be found by the reader. If D is principal, say, $D = \{X : j \in X\}$,
then $\prod_D(\mathfrak{A}_i : i \in I) \cong \mathfrak{A}_j$. When all $\mathfrak{A}_i$ are the same structure $\mathfrak{A}$, the ultraproduct
is written $\mathfrak{A}^I_D$ and called an ultrapower. For $a \in A$, let h(a) be the constant function
on I with value a, taken mod D. Then h is an elementary embedding of $\mathfrak{A}$
into $\mathfrak{A}^I_D$. We often 'identify' $\mathfrak{A}$ and its image under h, so that $\mathfrak{A}$ becomes an
elementary substructure of $\mathfrak{A}^I_D$. If $\mathfrak{A}$ is finite, then $\mathfrak{A}^I_D = \mathfrak{A}$ (by 5.1). It is known
that, D being non-principal and $\mathfrak{A}$ infinite, $\mathfrak{A}^I_D$ is nearly always a proper extension
of $\mathfrak{A}$. For example, consider the case $I = \omega$. Choose any one-to-one function f
on I into A. Then for each $a \in A$, $\{i \in \omega : f(i) = a\}$ has at most one element, so
$f/D \ne h(a)$. Hence $f/D$ is not in (the canonical image of) A.

This last remark has an application to model theory, of which we will only
give an example. Suppose $\mathfrak{A}$ is a structure of power $2^{\aleph_0}$. Is there a proper elementary
extension of $\mathfrak{A}$ of the same power? If $|S| \le 2^{\aleph_0}$ then (by considering
Th$(\mathfrak{A}, a)_{a \in A} \cup \{c \not\approx c_a : a \in A\}$) we obtain at once from 3.1 and 3.2 an affirmative
answer. But if $|S| > 2^{\aleph_0}$ that argument fails. However, ultraproducts give an
affirmative answer. Take a non-principal D over $\omega$. We just saw that $\mathfrak{A}^\omega_D$ is a proper
elementary extension of $\mathfrak{A}$. Its power is at most that of the Cartesian power $A^\omega$,
which is $(2^{\aleph_0})^{\aleph_0} = 2^{\aleph_0}$, as desired. (However, if A has certain cardinal numbers,
there is no such extension. For details of the known results on this subject due to
Rabin, Keisler, and others, see [3].)

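It follows at once from 5.1 that $EC$ or even $EC_\Delta$ properties $\mathcal{K}$ are preserved by
ultraproducts (i.e., if all $\mathfrak{A}_i$ belong to $\mathcal{K}$ so does $\prod_D(\mathfrak{A}_i : i \in I)$). However, even
more can be said, because of the simple but important fact that, in defining the
ultraproduct $\mathfrak{B}$, the universe B depends only on the universes $A_i$ and, moreover,
for each $P \in S$, $P^{\mathfrak{B}}$ depends only on the $P^{\mathfrak{A}_i}$ (and not on $Q^{\mathfrak{A}_i}$ for other $Q \in S$). Indeed,
we have at once: $PC_\Delta$ properties $\mathcal{K}$ are preserved by ultraproducts. (If $\mathcal{K} = \mathcal{K}' \restriction S$,
expand each $\mathfrak{A}_i$ to $\mathfrak{A}'_i \in \mathcal{K}'$. Then by 5.1, $\prod_D(\mathfrak{A}'_i : i \in I) \in \mathcal{K}'$ and $\prod_D(\mathfrak{A}_i : i \in I)
= \prod_D(\mathfrak{A}'_i : i \in I) \restriction S$.) For example, an ultraproduct of non-well-ordered structures
$(A_i, <_i)$ is not well-ordered.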

Ultraproducts are a whole subject in themselves. For example, even the list
of 'basic facts' begun above can be continued at some length before some harder
results are encountered. It is clear from what has already been said that ultraproducts
play a role not only in model theory, but also in algebra, and even in analysis
(where, after all, the 'almost everywhere' approach was invented). Moreover, a plain
set A is of course a structure (with S empty), and when we take ultraproducts
$\prod_D(A_i : i \in I)$ of sets we encounter the extremely interesting and difficult problem
in set theory: what can be said about the cardinal number $|B|$ of the ultraproduct?
($|B|$ clearly depends only on the cardinal numbers of the $A_i$ and on D.) Thus
ultraproducts also are part of the subject matter of set theory (and certainly ultrafilters
are).

Here we can only make this brief introduction to ultraproducts, and recommend 
the books [2] and [3] for further reading and references. We close by stating without 
proof two important results concerning ultraproducts in model theory. 

A deeper analysis shows that by choosing special kinds of ultrafilters one can 
make the ultraproducts have various special properties. A notable example is the 
following theorem of Keisler (cf. [3]). 


THEOREM 5.2. (G.C.H.) For any I of infinite power $\kappa$, there exists an ultrafilter
D over I such that, for any S with $|S| \le \kappa$ and any S-structures $\mathfrak{A}_i$ ($i \in I$),
each of infinite power $\le \kappa^+$, the ultraproduct $\prod_D(\mathfrak{A}_i : i \in I)$ is saturated and of
power $\kappa^+$.


Related to 5.2 is the following remarkable result which shows that the notion 
of elementary equivalence can be characterized in a ‘purely mathematical’ way 
(i.e., with no mention of a language). 


THEOREM 5.3. $\mathfrak{A}$ and $\mathfrak{A}'$ are elementarily equivalent if and only if they have
isomorphic ultrapowers.


Indeed, if we choose $\kappa = |S| + |A| + |A'|$ and $I = \kappa$, then, by 5.2, there exist D and
D' over I such that $\mathfrak{A}^I_D$ and $\mathfrak{A}'^I_{D'}$ are both saturated and of power $\kappa^+$; hence, by
4.3(a) they are isomorphic.

However, that argument depends on 5.2 which assumes the generalized conti- 
nuum hypothesis. Ten years after Keisler proved 5.2 (and 5.3 using G.C.H.), S. 
Shelah (in an article to appear in Israel J. Math.) has very recently succeeded in 
proving 5.3 outright. 

It follows easily from 5.3 and 5.1 that various other notions such as '$\mathcal{K}$ is an
$EC_\Delta$-class' can also be given purely mathematical definition (cf. [3]), i.e., definitions
making no mention of a language. The problem of finding such definitions
was initiated and emphasized by Tarski.


6. The $\omega$-completeness theorem. Ultraproducts and saturated models tend to
be uncountable and, vaguely speaking, to contain many kinds of elements. We
now study a method of obtaining models which are denumerable and omit certain
kinds of elements. Throughout this section S is assumed to be countable.

Suppose S contains a unary relation symbol N, and distinct constants
$c_0, c_1, \ldots, c_n, \ldots$ (as well as other symbols). An S-structure $\mathfrak{A}$ is called an $\omega$-model
(with respect to $N, c_0, c_1, \ldots$) if $N^{\mathfrak{A}} = \{c_0^{\mathfrak{A}}, c_1^{\mathfrak{A}}, \ldots\}$. For example, let
$\mathfrak{A}_0$ = (Reals, +, ·, $\omega$, 0, 1, ..., n, ...), where $N^{\mathfrak{A}_0} = \omega$ and $c_n^{\mathfrak{A}_0} = n$. If $\mathfrak{B} \equiv \mathfrak{A}_0$, then
the 'natural numbers' $N^{\mathfrak{B}}$ of $\mathfrak{B}$ may be 'non-standard' (meaning in this case that
$\mathfrak{B}$ is non-Archimedean), in which case $\mathfrak{B}$ is not an $\omega$-model.

As we have seen in examples in §2, in general no set of elementary sentences can
"say that $N = \{c_0, c_1, \ldots\}$", so the notion '$\omega$-model' is not elementary. Nevertheless,
as we will see, under certain conditions we can be sure that T has an $\omega$-model.

Let T be an S-theory such that $T \vdash Nc_n$ for each n. T is called $\omega$-complete if


(1) for any S-sentence $\forall v_0 \varphi$, if for all
n, $T \vdash \varphi(c_n)$, then $T \vdash \forall v_0(Nv_0 \to \varphi(v_0))$.


For example, it is obvious that


(2) the theory of an $\omega$-model $\mathfrak{A}$ or of any
class of $\omega$-models is $\omega$-complete.


The basic result called the $\omega$-completeness theorem states that if T has a model
and is $\omega$-complete, then T has an $\omega$-model.

As it happens, one can by almost exactly the same proof establish a more general
statement, having broader applications. (The original $\omega$-completeness theorem is
due to Henkin and Orey; the general statement just evolved.) To state the broader
form we must first adopt a broader meaning of '$\omega$-model' and '$\omega$-complete.' Instead
of N and $c_0, c_1, \ldots$, we now are given a fixed list $\theta_0, \theta_1, \ldots, \theta_n, \ldots$ of 1-formulas of
L(S). $\mathfrak{A}$ is called an $\omega$-model (with respect to the list $\theta_0, \theta_1, \ldots$) if $A = \bigcup(\theta_n^{\mathfrak{A}} : n \in \omega)$.
T is called $\omega$-complete if whenever $T \vdash \forall v_0(\theta_n(v_0) \to \varphi(v_0))$ for all n, then
$T \vdash \forall v_0 \varphi(v_0)$. The wording of the $\omega$-completeness theorem is unaltered. To interpret
the old ideas in the new, let $\theta_n$ be the formula $v_0 \approx c_n \lor \neg Nv_0$. Then clearly
the two notions of $\omega$-model coincide, as do the two notions of $\omega$-complete. (In the
new notions we have replaced each individual $c_n$ by a set, and then dropped the
subuniverse N, which is still covered by the device used just above.)

It is worthwhile noting that in contrapositive form the definition of $\omega$-complete
reads:


(3) T is $\omega$-complete if and only if for any 1-formula $\varphi$, if
$T + \exists v_0 \varphi(v_0)$ has a model, then for some n, $T + \exists v_0(\theta_n(v_0) \wedge \varphi(v_0))$
has a model.


In particular, taking $\varphi(v_0)$ to be $v_0 \approx d$, we obtain:


(4) if T is $\omega$-complete and $d \in S$, then for some n, $T + \theta_n(d)$
has a model.


A rather simple but important fact about $\omega$-completeness is in


THEOREM 6.1. Let $\sigma$ be a sentence involving the symbols in S(T) and perhaps
new individual constants. If T is $\omega$-complete, so is $T + \sigma$.


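Proof. First assume $\sigma$ has no new constants. Suppose that for all n,
$T + \sigma \vdash \forall v_0(\theta_n(v_0) \to \varphi(v_0))$. Then, clearly, for all n,


$T \vdash \forall v_0(\theta_n(v_0) \to (\sigma \to \varphi(v_0)))$.


Since T is $\omega$-complete, we infer that $T \vdash \forall v_0(\sigma \to \varphi(v_0))$, that is,


$T + \sigma \vdash \forall v_0 \varphi(v_0)$,


as desired.

It is now clearly enough to deal with the case (denoted by T + c) where one new
constant c and no $\sigma$ is added. Suppose that for all n, $T + c \vdash \forall v_0(\theta_n(v_0) \to \varphi(v_0))$. We
can suppose $\varphi(v_0)$ is $\psi\begin{pmatrix} v_0 & v_1 \\ v_0 & c \end{pmatrix}$, where c is not in $\psi$. Then clearly, for all n,


$T \vdash \forall v_0(\theta_n(v_0) \to \forall v_1 \psi(v_0, v_1))$.


Hence, by the $\omega$-completeness of T, $T \vdash \forall v_0 \forall v_1 \psi(v_0, v_1)$, whence


$T + c \vdash \forall v_0 \psi\begin{pmatrix} v_0 & v_1 \\ v_0 & c \end{pmatrix}$, as desired.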


Recall that, given S = S(T), we formed in (2) of §2 the set $\Delta_S$ consisting of certain
sentences $\sigma_n = \exists v_0 \varphi_n \to \varphi_n(d_n)$, where $d_0, \ldots, d_n, \ldots$ are new constants (and now
$\kappa = \omega$). By 6.1, each theory $T_m = T + \{\sigma_0, \ldots, \sigma_{m-1}\}$ is $\omega$-complete, assuming that
T is. An infinite extension usually does not preserve $\omega$-completeness, but the very
special extension $T' = T \cup \Delta_S$ does:


LEMMA 6.2. If T is $\omega$-complete so is $T + \Delta_S$.


Proof. The point is that for each m, T' is a 'conservative' extension of $T_m$,
i.e., if $\sigma$ is an $S(T_m)$-sentence and $T' \vdash \sigma$ then $T_m \vdash \sigma$. Indeed, if $T' \vdash \sigma$ then by
compactness, $T_{m+k} \vdash \sigma$ for some k. But any model $\mathfrak{A}$ of $T_m$ can clearly be expanded to
a model of $T_{m+k}$ (cf. (1) of §2), so $\mathfrak{A} \models \sigma$; thus $T_m \vdash \sigma$, as claimed. Now suppose,
for all n, $T' \vdash \forall v_0(\theta_n(v_0) \to \varphi(v_0))$. For some m, $\varphi$ is an $S(T_m)$-formula, and T'
conservatively extends $T_m$, so $T_m \vdash \forall v_0(\theta_n(v_0) \to \varphi(v_0))$, for all n. Hence, $T' \vdash \forall v_0 \varphi(v_0)$,
as desired (since $T_m$ is $\omega$-complete and $T_m \subseteq T'$).

Now we are ready for 




THEOREM 6.3. ($\omega$-Completeness Theorem). If T has a model and is $\omega$-complete
then T has a (countable) $\omega$-model.


Proof. The notation above is continued. We define $k_0, k_1, \ldots, k_n, \ldots$ inductively
in such a way that for each n,


(5) $\Sigma_n = T' + \{\theta_{k_0}(d_0), \ldots, \theta_{k_{n-1}}(d_{n-1})\}$ has a model.


It is clear (by (3) of §2) that $\Sigma_0 = T'$ has a model. Suppose we have (5) for a given n.
By 6.1 and 6.2, $\Sigma_n$ is $\omega$-complete. Hence, by (4), we can find $k_n$ such that $\Sigma_n + \{\theta_{k_n}(d_n)\}$
has a model, and the inductive step is complete.

By the compactness theorem, $\Sigma = \bigcup_n \Sigma_n$ has a model and, indeed, by 2.3 (or
3.1), $\Sigma$ has a model $\mathfrak{A}$ in which $A = \{d_n^{\mathfrak{A}} : n \in \omega\}$. Since for each n, $\mathfrak{A} \models \theta_{k_n}(d_n)$,
$\mathfrak{A}$ is an $\omega$-model.

The model we obtained was in fact countable, but we could always obtain a
countable model in 6.3 from an arbitrary one by applying the downward
Löwenheim-Skolem Theorem.

There are two ways in which 6.3 can still be strengthened. First, the formulas
$\theta_n(v_0)$ can be replaced by a list $\theta_n(v_0, \ldots, v_k)$ of (k + 1)-formulas, for an arbitrary
fixed k. All definitions and propositions are altered in the obvious way and the
proofs need only trivial changes. The second improvement allows more than one
list of $\theta_n$'s, and in fact countably many lists, so we are given $\theta^m_n(v_0, \ldots, v_{k(m)})$
(m, n = 0, 1, ...). $\mathfrak{A}$ is called an $\omega$-model (over all these lists) if $\mathfrak{A}$ is an $\omega$-model
over each; and T is called $\omega$-complete (over all these lists) if T is $\omega$-complete over
each. The $\omega$-completeness theorem in final form reads the same as ever! (6.3 is
henceforth given this broader interpretation.) The only change needed in the proof
is in the final proof of 6.3. There are now many more things to accomplish (in place
of only (5)), but there are still only countably many, so our tasks can be arranged
in an $\omega$-list and then accomplished exactly as before.

The condition for $\mathfrak{A}$ to be an $\omega$-model with respect to one list of k-formulas
$\theta_0, \ldots, \theta_n, \ldots$ is just that $\mathfrak{A}$ have no k-tuple of elements satisfying $\Phi = \{\neg\theta_n : n \in \omega\}$;
or, as we say, that $\mathfrak{A}$ has no k-tuple of elements of the kind $\Phi$; or simply that $\mathfrak{A}$
omits $\Phi$. Therefore 6.3 is sometimes called the 'omitting kinds theorem.' (Actually
the word 'type' is used instead of 'kind,' but in §4 we have reserved that for the
case when (the theory axiomatized by) $\Phi$ is complete.)


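6.3 and 6.1 together imply a converse of (2):


(6) If T is $\omega$-complete and $\mathcal{K}$ is the class of all $\omega$-models of
T, then T = Th $\mathcal{K}$.


Indeed, if $\sigma \notin T$, then $T + \neg\sigma$ is $\omega$-complete, by 6.1, so has an $\omega$-model by 6.3;
hence $\sigma \notin$ Th $\mathcal{K}$.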




Speaking informally now, if we are given a set $\Sigma$ of sentences, we can consider
the smallest set $\Sigma' \supseteq \Sigma$ closed under the axioms and rules of inference of first order
logic (see the discussion just after 2.3) and also under the (infinitary) "$\omega$-rule":
if $\varphi(c_n) \in \Sigma'$ for each n, then $\forall v_0(Nv_0 \to \varphi(v_0)) \in \Sigma'$. (Here we are returning for
simplicity to the simplest notions of $\omega$-model and $\omega$-complete, that is, over
$N, c_0, c_1, \ldots$.) Then (6) tells us that


(7) a sentence $\sigma$ belongs to $\Sigma'$ if (and only if) $\sigma$ is true in
all $\omega$-models of $\Sigma$.


It is in the form (7) that the $\omega$-completeness theorem gets its name, as an analogue
of Gödel's completeness theorem. The general $\omega$-completeness theorem is also,
in a sense, equivalent to the completeness theorem for $L_{\omega_1\omega}$ of Karp [5], a very
useful completeness result concerning certain (infinitary) axioms and (infinitary)
rules of inference for the language $L_{\omega_1\omega}$. (For one statement on a connection
between $L_{\omega_1\omega}$ and $\omega$-models, see 7.5.)

There are many interesting applications of the $\omega$-completeness theorem. One
of the nicest will be the topic of §7. Another, which we now discuss, deals with
denumerable models of complete theories and especially with prime models and
$\aleph_0$-categoricity. We assume henceforth in this section that T is a complete S-theory
having infinite models.

We shall first prove a result about omitting types of elements over (a complete
theory) T. Let us fix a list of new constants $c_0, \ldots, c_n, \ldots$ not in S = S(T). As in §4,
an n-type (over T) is any complete theory T' over $S_n = S \cup \{c_0, \ldots, c_{n-1}\}$. If T'
is finitely axiomatizable over T so that, in fact, for some single $S_n$-sentence $\sigma$,
$T' = T + \sigma$, then we say T' is principal. Moreover, if $\varphi$ is an n-formula over S,
and $T + \varphi(c_0, \ldots, c_{n-1})$ is complete, then $\varphi$ is called an n-atom (over T). (In fact,
$\varphi$ is an atom in the Boolean algebra of n-formulas modulo T-equivalence.) On the
other hand, $\varphi$ is called atomless (over T) if $T \vdash (\exists v_0, \ldots, v_{n-1})\varphi$ and for no n-atom
$\psi$ do we have $T \vdash \psi \to \varphi$.


THEOREM 6.4. Any complete theory T has a denumerable model $\mathfrak{A}$ in which
every finite sequence of elements satisfies either an atom or an atomless formula.
Moreover, given non-principal $k_m$-types $T_m$ (m = 0, 1, ...), such an $\mathfrak{A}$ can be found
which omits every $T_m$.


Proof. Consider any fixed m, and let $\psi_n$ ($n \in \omega$) be a list of the $k_m$-formulas
such that $\psi_n(c_0, \ldots, c_{k_m-1}) \in T_m$. Then T is $\omega$-complete with respect to $\sim\psi_0, \ldots, \sim\psi_n, \ldots$.
To see this first note that $T \nvdash \forall v_0\,\phi$ if and only if $T \vdash \exists v_0 \sim\phi$, since T is complete.
Assume $k_m = 1$ to simplify notation. Now suppose that for each n, $T \vdash \sim\psi_n(v_0) \to \phi(v_0)$,
but $T \vdash \exists v_0 \sim\phi$. Then clearly $\sim\phi(c_0)$ would be an axiom for $T_m$, contrary to the
assumption $T_m$ is non-principal.

Secondly, let k be fixed, and let $\phi_n$ ($n \in \omega$) be a list of all k-formulas such that $\phi_n$
is either atomless or an atom. Again, T is $\omega$-complete with respect to $\phi_0, \ldots, \phi_n, \ldots$.
Indeed, suppose $\phi$ is a k-formula over S and, for each n, $T \vdash \phi_n \to \phi$, but
$T \vdash (\exists v_0, \ldots, v_{k-1}) \sim\phi$. Then for any k-atom $\psi$, $T \nvdash \psi \to \sim\phi$; so $\sim\phi$ is atomless.
Hence $T \vdash \sim\phi \to \phi$, so $T \vdash \phi$, a contradiction.

Now we apply the general $\omega$-completeness theorem to the '$\omega + \omega$' lists above
(one for each m, and one for each k), and obtain a denumerable model $\mathfrak{A}$, $\omega$-complete
with respect to every list. Clearly $\mathfrak{A}$ is just as desired.


A model $\mathfrak{A}$ of T is called prime if $\mathfrak{A}$ is elementarily embeddable in every model
of T. $\mathfrak{A}$ is called atomic if, for any n, every n-tuple of elements of A satisfies an n-atom
(i.e., for each n, $\mathfrak{A}$ omits every non-principal n-type). The theory T will be called
atomistic if for each n, there are no atomless n-formulas over T.


THEOREM 6.5. Let $\mathfrak{A}$ and $\mathfrak{B}$ be models of T.

(a) If $\mathfrak{A}$ is denumerable and atomic then $\mathfrak{A}$ is homogeneous.
(b) If $\mathfrak{A}$ and $\mathfrak{B}$ are denumerable and atomic, then $\mathfrak{A} \cong \mathfrak{B}$.
(c) $\mathfrak{A}$ is prime if and only if $\mathfrak{A}$ is denumerable and atomic.
(d) T has a prime model if and only if T is atomistic.


Proof. We shall use 6.4 and also some simple Cantor-type arguments (see (1),
(2), and (3) of §4), the latter being left to the reader. (a) is trivial using 4.3(b). (b) is
easy using Cantor's method. As regards (c), if $\mathfrak{A}$ is denumerable and atomic, and
$\mathfrak{A} \equiv \mathfrak{B}$, then it is easy to embed $\mathfrak{A}$ elementarily in $\mathfrak{B}$, by using Cantor's method;
so $\mathfrak{A}$ is prime. Now assume $\mathfrak{A}$ is prime. By the Löwenheim-Skolem theorem, T has
a countable model, so clearly $\mathfrak{A}$ is denumerable. If $\mathfrak{A}$ is not atomic, then, for some
$a_0, \ldots, a_n \in A$, $T' = \mathrm{Th}((\mathfrak{A}, a_0, \ldots, a_n))$ is non-principal. By (a very special case of)
6.4, T has a model $\mathfrak{B}$ omitting $T'$. But then clearly $\mathfrak{A}$ cannot be elementarily
embedded in $\mathfrak{B}$, a contradiction. In (d), if T has a prime, and hence atomic, model,
it follows trivially that T is atomistic. Finally, suppose T is atomistic. By 6.4, there
is a denumerable model $\mathfrak{A}$ of T in which every finite sequence of elements satisfies
an atom or an atomless formula. Since the second possibility can never occur, $\mathfrak{A}$
must be atomic, and hence (by (c)) prime.

We need below two simple facts about n-types which are really just familiar 
results concerning arbitrary Boolean algebras. 

(8) If there are infinitely many T-inequivalent n-formulas, then there is a
non-principal n-type over T.

(9) If there is an atomless n-formula over T, then there are uncountably many
(in fact $2^{\aleph_0}$) n-types over T.


Proof. As regards (8), if there is an atomless n-formula $\phi$ over T then any
n-type containing $\phi(c_0, \ldots, c_{n-1})$ is clearly non-principal. If there is not then there
are obviously infinitely many T-inequivalent n-atoms. Let $\Sigma = \{\sim\phi(c_0, \ldots, c_{n-1}) :
\phi$ is an n-atom$\}$. Clearly $T \cup \Sigma$ is finitely satisfiable, so has a model $(\mathfrak{A}, a_0, \ldots, a_{n-1})$.
Then $T' = \mathrm{Th}(\mathfrak{A}, a_0, \ldots, a_{n-1})$ is non-principal.

Regarding (9), we only remark that given an atomless $\phi$ we can clearly find
atomless formulas $\phi_0$ and $\phi_1 = \phi \wedge \sim\phi_0$ such that $T \vdash \phi_0 \to \phi$; i.e., we can split
$\phi$ in half. Iterating indefinitely, we easily obtain (by another famous argument
going back to Cantor) the desired $2^{\aleph_0}$ different n-types.
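
The iteration can be written out as a binary tree of formulas. The following is only a sketch of that bookkeeping; the indexing by finite 0-1 sequences s and the names $\phi_s$, $\Gamma_f$ are our own notation, not the author's.

```latex
% Sketch of the splitting tree; s ranges over finite sequences of 0's and 1's.
\[
  \phi_{\emptyset} = \phi, \qquad
  \phi_{s0} \text{ atomless},\quad T \vdash \phi_{s0} \to \phi_s, \qquad
  \phi_{s1} = \phi_s \wedge \sim\phi_{s0} \text{ atomless}.
\]
% Each infinite branch f \in \{0,1\}^{\omega} gives a set which is finitely
% satisfiable together with T, since the formulas decrease along the branch
% and each one is consistent with T:
\[
  \Gamma_f = \{\, \phi_{f \restriction k}(c_0, \ldots, c_{n-1}) : k \in \omega \,\}.
\]
% Two distinct branches f \ne g first differ at some k, where the formulas
% \phi_{(f \restriction k)0} and \phi_{(f \restriction k)1} are T-incompatible.
% Extending each T + \Gamma_f to an n-type therefore yields 2^{\aleph_0}
% pairwise distinct n-types.
```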

A direct consequence of (9) is 

(10) If T has a denumerable saturated model, then T has a prime model. 


Indeed, by 4.5, T has only countably many n-types, for each n. Hence T is
atomistic, by (9), and so T has a prime model, by 6.5(d).


THEOREM 6.6. T is $\aleph_0$-categorical if and only if for each n, there are only
finitely many T-inequivalent n-formulas (or what is the same, there are no
non-principal n-types).


Proof. The second condition and the parenthetical condition are equivalent
by (8) (and its trivial converse). To say there are no non-principal n-types, for any
n, is just to say that (*) all denumerable models of T are atomic. (*) implies
$\aleph_0$-categoricity by 6.5(b). If T is $\aleph_0$-categorical then by (a very special case of) 6.4, T
clearly cannot have a non-principal n-type. (Or, its only denumerable model clearly
satisfies the definition of 'prime', so by 6.5(c), (*) holds.)
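
To see the finiteness condition at work on a concrete theory, one can count n-types. The sketch below is an illustration added here, not part of the original argument. It assumes the standard fact that, for the theory of dense linear orderings without endpoints, the n-types correspond exactly to the ways of arranging n variables in a weak ordering (ties allowed); the function name `count_dlo_types` is hypothetical.

```python
from math import comb


def count_dlo_types(n: int) -> int:
    """Count weak orderings of n labelled variables.

    Assumption (not proved here): for dense linear orderings without
    endpoints, each n-type is determined by which variables are equal
    and how the resulting blocks are ordered.
    """
    counts = [1]  # zero variables: exactly one (empty) arrangement
    for m in range(1, n + 1):
        # Choose the k variables forming the smallest block, then
        # arrange the remaining m - k variables above them.
        counts.append(sum(comb(m, k) * counts[m - k] for k in range(1, m + 1)))
    return counts[n]


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, count_dlo_types(n))
```

For every n the count is finite, which is the shape of condition required by 6.6; for n = 2 the three arrangements are $v_0 < v_1$, $v_0 \approx v_1$ and $v_1 < v_0$.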


From 6.4-6.6 together with what we know about denumerable saturated models 
follows a curious result: 


THEOREM 6.7. No complete theory T has exactly two nonisomorphic
denumerable models.


Ehrenfeucht gave examples of complete theories $T_n$ having exactly n such models
(n = 3, 4, 5, ...); cf. [17].
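
For orientation, the case n = 3 is usually presented as follows. This is the textbook form of the example and is given here only as a sketch; it is not reproduced from [17], and the name $T_3$ for this particular presentation is our own labelling.

```latex
% Illustration only: the standard three-model example.
\[
  T_3 = \mathrm{Th}\bigl((\mathbb{Q}, <, c_n)_{n \in \omega}\bigr),
  \qquad c_0 < c_1 < c_2 < \cdots
\]
% Up to isomorphism the denumerable models are distinguished by the behaviour
% of the sequence of constants:
\begin{enumerate}
  \item the $c_n$ are unbounded (e.g. $c_n = n$); this is the prime model;
  \item the $c_n$ are bounded and have a least upper bound (e.g. $c_n = -1/(n+1)$);
  \item the $c_n$ are bounded with no least upper bound (e.g. rationals
        increasing to $\sqrt{2}$); this is the saturated model.
\end{enumerate}
```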


Proof. Suppose T has exactly two. Then clearly there are only countably many
n-types over T, for each n. Hence T has a denumerable saturated model $\mathfrak{B}$ (by 4.5)
and a prime model $\mathfrak{A}$.

Since T is not categorical in power $\aleph_0$, there is (by 6.6) a non-principal n-type
$T'$ over T. Of course, $T'$ is realized in $\mathfrak{B}$, say by $(b_0, \ldots, b_{n-1})$, but not in $\mathfrak{A}$, so
$\mathfrak{A} \not\cong \mathfrak{B}$. Since T has infinitely many inequivalent n-formulas, so obviously has
$T'$. Hence (again by 6.6), $T'$ is not $\aleph_0$-categorical. Hence $T'$ has a denumerable
model $(\mathfrak{C}, x_0, \ldots, x_{n-1})$ not isomorphic to $(\mathfrak{B}, b_0, \ldots, b_{n-1})$. But $\mathfrak{B}$ is homogeneous
(by 4.3(c)), so clearly $\mathfrak{C} \not\cong \mathfrak{B}$. Of course, $\mathfrak{C} \not\cong \mathfrak{A}$, as $\mathfrak{C}$ realizes $T'$. This is a
contradiction.


The theory T, of the complex number field has a denumerable saturated model
and hence (by (10)) a prime model. (In fact the field of complex algebraic numbers 
is the prime model of T, .) However, the theory T, of the real field has no denumerable 
saturated model (cf. §4), but does have a prime model, the field of algebraic real 
numbers (although this will not be proved here). If $\mathfrak{A}$ is any structure which has a
definable well-ordering (e.g., $(\omega, +, \cdot)$), then obviously $T = \mathrm{Th}\,\mathfrak{A}$ is atomistic
and hence has a prime model. This is a very special case, as here the atoms (say,
1-atoms) $\phi$ satisfy the additional condition $T \vdash \exists! v_0\, \phi(v_0)$ and correspond to
'definable elements'. Actually, it takes a little effort to show there is any theory T having
no prime model; though it may be that in some sense 'most' theories T do not have
prime models.

6.4-6.7 is the work of Ryll-Nardzewski, Ehrenfeucht, Engeler, Svenonius, and
the author (cf. [17]).


7. Two-cardinal theorems. The Löwenheim-Skolem theorems (§3) are about
the cardinal numbers of the models $\mathfrak{A}$ of an arbitrary set $\Sigma$ of sentences. Now, in
fact, if $\theta$ is any fixed 1-formula, then each $\mathfrak{A}$ determines a pair
$(\overline{\overline{A}}, \overline{\overline{\theta^{\mathfrak{A}}}})$ of cardinals.
For example, if the $\mathfrak{A}$'s are groups, $\theta^{\mathfrak{A}}$ might be (for each $\mathfrak{A}$) the center of $\mathfrak{A}$. What
can be said about the pairs of cardinals $(\overline{\overline{A}}, \overline{\overline{\theta^{\mathfrak{A}}}})$ achieved by models $\mathfrak{A}$ of $\Sigma$? If $\Sigma$
has a model where the pair is $(\kappa, \lambda)$, for what $(\kappa', \lambda')$ can we always be sure that $\Sigma$
has a model whose pair is $(\kappa', \lambda')$? Answers or partial answers to these questions
may be called Löwenheim-Skolem theorems for two cardinals, or simply,
two-cardinal theorems. Incidentally, in most of this work, only infinite $\theta^{\mathfrak{A}}$ are considered,
as most questions about finite $\theta^{\mathfrak{A}}$ reduce trivially to questions about one cardinal
or about infinite $\theta^{\mathfrak{A}}$.

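Again in this section, S is always countable. $\theta$ will always be a 1-formula (over S).
With $\theta$ fixed, we agree to say that $\mathfrak{A}$ is of type $(\kappa, \lambda)$ to mean that
$\overline{\overline{A}} = \kappa$ and $\overline{\overline{\theta^{\mathfrak{A}}}} = \lambda$.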

The following two-cardinal theorem was established in [11] by using the
homogeneous models of §4.


(1) If a set $\Sigma$ of S-sentences has a model $\mathfrak{A}$ of type $(\kappa, \lambda)$,
where $\aleph_0 \le \lambda < \kappa$, then $\Sigma$ has a model $\mathfrak{C}$ of type $(\aleph_1, \aleph_0)$.


Later Keisler [6] gave a different proof, using the $\omega$-completeness theorem, and
greatly strengthened (1), essentially extending it to the language $L_{\omega_1\omega}$ (see 7.6 below).
This work will be the main topic of this section. We start right off with the principal
theorem, which is still about the elementary language, but is in such a strong form
that it will immediately imply the extension of (1) to $L_{\omega_1\omega}$.


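THEOREM 7.1. Suppose $\aleph_0 \le \overline{\overline{\theta^{\mathfrak{A}}}} < \overline{\overline{A}}$. Then there exist $\mathfrak{B}$ and $\mathfrak{C}$ such that
$\mathfrak{B} \prec \mathfrak{A}$, $\mathfrak{C} \succ \mathfrak{B}$, $\theta^{\mathfrak{B}} = \theta^{\mathfrak{C}}$, $\overline{\overline{B}} = \aleph_0$, and $\overline{\overline{C}} = \aleph_1$.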


Proof. There is clearly no loss of generality in assuming that $S = S(\mathfrak{A})$
contains a unary relation symbol N such that $\theta^{\mathfrak{A}} = N^{\mathfrak{A}}$. (This is like passing from
the group $(G, \circ)$ to the group $(G, \circ, {}^{-1})$.) Let $\kappa = \overline{\overline{N^{\mathfrak{A}}}}$. By 3.2, $\mathfrak{A}$ has an elementary
substructure $\mathfrak{A}'$ of power $\kappa^+$ with the same N (i.e., $N^{\mathfrak{A}'} = N^{\mathfrak{A}}$). Hence we may as
well just assume to begin with that $\overline{\overline{A}} = \kappa^+$.

Let < be a well-ordering of A of order type $\kappa^+$, and let < be a corresponding
new relation symbol. We introduce the abbreviation $Qu\,\phi$ (read 'there exist
arbitrarily large u such that $\phi$') for the formula $\forall z \exists u (z < u \wedge \phi)$ (where z does not occur
in $\phi$). The cardinal $\kappa^+$ is regular (see end §1). Hence the structure $(\mathfrak{A}, <)$ is clearly
a model of every sentence of the form:


(2) $(\forall w_0 \ldots w_{n-1}) [Qu\, \exists v (Nv \wedge \psi(u, v)) \to \exists v (Nv \wedge Qu\, \psi(u, v))]$.


(Here $\psi$ is any $S \cup \{<\}$-formula, perhaps involving the $w_i$'s as parameters.)
By the downward Löwenheim-Skolem theorem, $(\mathfrak{A}, <)$ has a denumerable
elementary substructure $\mathfrak{B}' = (\mathfrak{B}, <)$.
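
The step from regularity of $\kappa^+$ to the sentences (2) is a counting argument. It can be spelled out as in the following sketch, where the sets $U_v$ and $U$ are auxiliary notation introduced only for this illustration.

```latex
% Sketch: fix parameters w_0, ..., w_{n-1} in A and put, for each v in N^{\mathfrak A},
\[
  U_v = \{\, u \in A : (\mathfrak{A}, <) \models \psi(u, v) \,\}, \qquad
  U = \bigcup_{v \in N^{\mathfrak{A}}} U_v .
\]
% The hypothesis Qu \exists v (Nv \wedge \psi(u, v)) says that U is cofinal in (A, <).
% Since \kappa^+ is regular, a cofinal subset of an ordering of type \kappa^+
% has power \kappa^+, while N^{\mathfrak A} has power \kappa; hence
\[
  \kappa^+ = \overline{\overline{U}} \le \sum_{v \in N^{\mathfrak{A}}} \overline{\overline{U_v}} .
\]
% If every U_v had power at most \kappa, the sum would be at most
% \kappa \cdot \kappa = \kappa. So some U_v has power \kappa^+. Every proper
% initial segment of (A, <) has power at most \kappa, so this U_v is cofinal,
% i.e. Qu \psi(u, v) holds for that v in N^{\mathfrak A}.
```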


LEMMA 7.2. Let $\Sigma_0$ consist of all the sentences (2), plus a sentence saying
that < is a simple ordering of the universe with no last element, plus sentences
saying N has at least n elements (n = 1, 2, 3, ...). Then:

(a) Every denumerable model $\mathfrak{B}'$ of $\Sigma_0$ has a proper elementary extension
with the same N.


Obviously the $\mathfrak{B}'$ constructed above is a model of $\Sigma_0$. For later reference, we
carry out the next steps (Lemma 7.2(a), (b)) in the proof of 7.1 for an arbitrary
denumerable model $\mathfrak{B}'$ of $\Sigma_0$.

7.2(a) is the main step in the proof of 7.1. To prove it, first select distinct new
constants $c_n$ for $n \in \omega$ and $d_b$ for $b \in B'$. Clearly $N^{\mathfrak{B}'}$ is denumerable, so we write
$N^{\mathfrak{B}'} = \{a_0, \ldots, a_n, \ldots\}$. Let $\mathfrak{B}^* = (\mathfrak{B}', a_n, b)_{n \in \omega,\, b \in B'}$ (where $c_n$ denotes $a_n$ and $d_b$
denotes b). Select another new constant e, and let $T = \mathrm{Th}(\mathfrak{B}^*) + \{d_b < e : b \in B'\}$.
It is easy to see (without using (2)) that if $\psi$ is any 1-formula over $S(\mathfrak{B}^*)$ then:


(3) $T + \psi(e)$ has a model if and only if $\mathfrak{B}^* \models Qu\,\psi(u)$.


Indeed, if the right side of (3) fails, then for some $b \in B'$, $\mathfrak{B}^* \models (\forall v_0 > d_b) \sim\psi(v_0)$,
so $T \vdash \sim\psi(e)$. On the other hand, if $\mathfrak{B}^* \models Qu\,\psi(u)$, then clearly every finite subset
of $T + \psi(e)$ has a model (of the form $(\mathfrak{B}^*, b')$); so $T + \psi(e)$ has a model.
Clearly T has a model (e.g., by (3) with $u \approx u$ for $\psi$).
We now consider the notions $\omega$-model and $\omega$-complete, with respect to N and
$c_0, \ldots, c_n, \ldots$ (and thus in the simplest form, at the beginning of §6). We will show
that T is $\omega$-complete. Suppose $\phi$ is any 2-formula over S(T) not involving e and


(4) $T + \exists v (Nv \wedge \phi(e, v))$ has a model.


What we need to show is that for some m, $T + \phi(e, c_m)$ has a model (cf. (1) of §6
in contrapositive form). Now, by (3) and (4),


$\mathfrak{B}^* \models Qu\, \exists v (Nv \wedge \phi(u, v))$.


Since $\mathfrak{B}'$ is a model of the sentences (2), we can infer:


$\mathfrak{B}^* \models \exists v (Nv \wedge Qu\, \phi(u, v))$.


($\phi$ may involve $c_n$'s and $d_b$'s, but parameters were allowed in (2).) Hence, for some
m, $\mathfrak{B}^* \models Qu\, \phi(u, c_m)$. Therefore, by (3), $T + \phi(e, c_m)$ has a model, as desired.

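We have shown that T is $\omega$-complete and has a model so, by the $\omega$-completeness
theorem, T has an $\omega$-model $\mathfrak{C}$. $\mathfrak{B}^*$ is elementarily embeddable in $\mathfrak{C}$, so we can clearly
assume that outright $\mathfrak{B}^* \prec \mathfrak{C}$. Thus $\mathfrak{C}$ can be written as $(\mathfrak{B}'', a_n, b, x)_{n \in \omega,\, b \in B'}$, where
$\mathfrak{B}' \prec \mathfrak{B}''$. By the axioms of T, $x \notin B'$, so $\mathfrak{B}''$ is a proper extension of $\mathfrak{B}'$. Since $\mathfrak{C}$
is an $\omega$-model, $N^{\mathfrak{B}''} = \{a_n : n \in \omega\} = N^{\mathfrak{B}'}$. Thus 7.2(a) is proved.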


LEMMA 7.2(b). Any denumerable model $\mathfrak{B}'$ of $\Sigma_0$ has an elementary extension
$\mathfrak{C}'$ of power $\aleph_1$ with the same N.


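To prove 7.2(b), define denumerable models $\mathfrak{B}_\alpha$ of $\Sigma_0$ for $\alpha < \omega_1$ by
recursion as follows: $\mathfrak{B}_0 = \mathfrak{B}'$. $\mathfrak{B}_{\alpha+1}$ is a denumerable proper elementary extension
of $\mathfrak{B}_\alpha$ with the same N, as guaranteed by 7.2(a). $\mathfrak{B}_\alpha = \bigcup (\mathfrak{B}_\beta : \beta < \alpha)$ if $\alpha$ is a
limit ordinal; $\mathfrak{B}_\alpha$ is a model of $\Sigma_0$ by the union theorem (4.1). Then clearly
$\mathfrak{C}' = \bigcup (\mathfrak{B}_\alpha : \alpha < \omega_1)$ is as desired in 7.2(b).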
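
The cardinality bookkeeping behind 'clearly' can be sketched as follows. Writing $B_\alpha$ and $C'$ for the universes of $\mathfrak{B}_\alpha$ and $\mathfrak{C}'$ is our notation for this illustration.

```latex
% Sketch of the elementary chain and of the size of its union.
\[
  \mathfrak{B}' = \mathfrak{B}_0 \prec \mathfrak{B}_1 \prec \cdots \prec
  \mathfrak{B}_\alpha \prec \mathfrak{B}_{\alpha+1} \prec \cdots
  \quad (\alpha < \omega_1), \qquad N^{\mathfrak{B}_\alpha} = N^{\mathfrak{B}'} .
\]
% Each B_\alpha is countable and each inclusion B_\alpha \subsetneq B_{\alpha+1}
% is proper, so the union has exactly \aleph_1 elements:
\[
  \aleph_1 \le \overline{\overline{C'}}
  = \overline{\overline{\textstyle\bigcup_{\alpha < \omega_1} B_\alpha}}
  \le \aleph_1 \cdot \aleph_0 = \aleph_1 ,
\]
% while N^{\mathfrak{C}'} = \bigcup_\alpha N^{\mathfrak{B}_\alpha} = N^{\mathfrak{B}'}
% remains denumerable.
```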

Just before 7.2(a) we constructed a particular $\mathfrak{B}' = (\mathfrak{B}, <)$. Taking $\mathfrak{C}' = (\mathfrak{C}, <')$
as in 7.2(b) for this $\mathfrak{B}'$, it is clear that $\mathfrak{B}$ and $\mathfrak{C}$ are as demanded in 7.1. Thus 7.1
is proved.

From the proof of 7.1 we can also infer a new compactness theorem and a new
completeness theorem. Again it is convenient to assume outright that S contains
a special symbol N (instead of dealing with a special formula $\theta$) and is just as
general. (The type $(\kappa, \lambda)$ of $\mathfrak{A}$ refers now to $(\overline{\overline{A}}, \overline{\overline{N^{\mathfrak{A}}}})$.) Let < be a symbol not in S,
and let the set $\Sigma_0$ be as in 7.2. (Note that $\Sigma_0$ depends only on S.)


LEMMA 7.3. A set $\Sigma$ of S-sentences has a model of type $(\aleph_1, \aleph_0)$ if and only
if $\Sigma \cup \Sigma_0$ has a model.


Proof. Any model of type $(\aleph_1, \aleph_0)$ can trivially be expanded to a model of
$\Sigma_0$ (as we saw at the beginning of the proof of 7.1). Hence from left to right is obvious.
Now suppose $\Sigma \cup \Sigma_0$ has a model and hence a denumerable model $\mathfrak{B}'$. By 7.2(b),
$\mathfrak{B}'$ has an elementary extension $\mathfrak{A}' = (\mathfrak{A}, <)$ with the same N and of power $\aleph_1$.
Then $\mathfrak{A}$ is a model of $\Sigma$ of type $(\aleph_1, \aleph_0)$.


THEOREM 7.4. (a) (Compactness): If every finite subset of a set $\Sigma$ of S-sentences
has a model $\mathfrak{A}$ of type $(\aleph_1, \aleph_0)$, then so has $\Sigma$.

(b) (Completeness): An S-sentence $\sigma$ is true in all models of type $(\aleph_1, \aleph_0)$ if
and only if $\Sigma_0 \vdash \sigma$ (and hence if and only if $\sigma$ is formally derivable from $\Sigma_0$ by
using the axioms and rules of inference of elementary logic).


Proof. (a) is immediate from 7.3 and the ordinary compactness Theorem 3.1.
(b) is really a special case of 7.3. Indeed, $\Sigma_0 \vdash \sigma$ if and only if $\{\sim\sigma\} \cup \Sigma_0$ has no model
and hence (by 7.3) if and only if $\sigma$ is true in all models of type $(\aleph_1, \aleph_0)$. Of course,
the parenthetical addition in (b) depends on the completeness theorem, discussed
after 2.3.


The two-cardinal theorem (1) is an immediate consequence of 7.1. ((1) and also
7.4 and 7.3 (for a different $\Sigma_0$) were originally obtained (by the author and Fuhrken)
by a different method, involving homogeneous models.) Let us form a new language
$L^*(S)$ by adding to L(S) a new quantifier symbol $Q^*$ and interpreting formulas
of the form $Q^* u\,\phi(u)$ to mean "there are uncountably many u such that $\phi(u)$." By
means of a method of translation introduced by Fuhrken, one can infer from 7.4(a)
that the ordinary compactness theorem 2.1 holds for $L^*(S)$ as long as S is countable.
(1) and 7.4(b) also have implications for $L^*(S)$. (For all this work, and references,
see [2].) However, in the completeness theorem for $L^*(S)$ so obtained (and in 7.4(b)
itself), the formal 'derivations' of the $L^*(S)$-valid sentences involve extraneous
symbols (e.g., for the above $\Sigma_0$, the symbol <), a feature not present in most
logical systems. Keisler has given for $L^*(S)$ a completeness theorem for some simple
and elegant axioms and rules of inference involving no extraneous symbols. To do
so, he must discard 7.1-7.4 and begin again, working in $L^*(S)$, but some of the
arguments are quite close to those for 7.1-7.4 in flavor. (For this interesting work
see [7].)

Now we shall see the extra strength of 7.1, over what is in (1) and 7.4, by
establishing in 7.6 that (1) holds (for countable $\Sigma$) in $L_{\omega_1\omega}$.

First we need the following simple lemma relating $L_{\omega_1\omega}$ to the notion $\omega$-model.


LEMMA 7.5. For any $L_{\omega_1\omega}(S)$-sentence $\sigma$, there exists a countable set $\Sigma$ of
elementary sentences, containing new symbols $N', d_0, \ldots, d_n, \ldots$ (and other new
symbols as well as those of S) such that for any S-structure $\mathfrak{A}$:

$\mathfrak{A}$ is a model of $\sigma$ if and only if $\mathfrak{A}$ can be expanded to an $\omega$-model of $\Sigma$.


Proof. Of course, '$\omega$-model' is understood with respect to $N', d_0, \ldots, d_n, \ldots$.
Clearly we may assume the only logical symbols in $\sigma$ are $\sim$, $\exists$, and $\bigvee$ (as well as
$\approx$). Let $N', d_0, \ldots, d_n, \ldots$ be new symbols. For each subformula $\phi$ of $\sigma$ which is a
disjunction, introduce a new (n + 1)-ary relation symbol $P_\phi$, where n is the number
of free variables in $\phi$. By recursion we correlate with each subformula $\phi$ of $\sigma$ a
formula $\phi^*$. If $\phi$ is atomic, $\phi^* = \phi$. Also, $(\sim\phi)^* = \sim\phi^*$ and $(\exists u\,\phi)^* = \exists u\,\phi^*$.
If $\phi$ is $\bigvee_m \phi_m$, and its free variables are $u_0, \ldots, u_{n-1}$, then $\phi^*$ is
$\exists z (N'z \wedge P_\phi(z, u_0, \ldots, u_{n-1}))$. $\Sigma$ consists of (i) all the sentences


$(\forall u_0, \ldots, u_{n-1}) (P_\phi(d_m, u_0, \ldots, u_{n-1}) \leftrightarrow \phi_m^*)$


(where $\phi = \bigvee_m \phi_m$ is a subformula of $\sigma$ with the free variables $u_0, \ldots, u_{n-1}$, and
$m \in \omega$), (ii) all the sentences $d_i \ne d_j$ (for $i \ne j$) and $N'd_i$, and (iii) the sentence $\sigma^*$.
The notation is complicated, but the idea is very simple, and it is easily checked that $\Sigma$ is as
demanded.
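
A small worked instance may help. The following sketch takes a sentence of our own choosing, namely $\sigma = \sim\exists u \sim\bigvee_m \phi_m(u)$ with each $\phi_m$ an atomic S-formula in the single free variable u; the example is an assumption made for illustration and does not come from the original text.

```latex
% Illustration: \phi = \bigvee_m \phi_m(u) has one free variable, so P_\phi is binary.
\[
  \phi^* = \exists z \bigl( N'z \wedge P_\phi(z, u) \bigr), \qquad
  \sigma^* = \sim\exists u \sim\exists z \bigl( N'z \wedge P_\phi(z, u) \bigr).
\]
% \Sigma then consists of
\[
  \text{(i)}\;\; \forall u \bigl( P_\phi(d_m, u) \leftrightarrow \phi_m(u) \bigr)
  \;\;(m \in \omega), \qquad
  \text{(ii)}\;\; d_i \ne d_j \;(i \ne j),\;\; N'd_i, \qquad
  \text{(iii)}\;\; \sigma^* .
\]
% In an \omega-model the elements of N' are exactly the denotations of the d_m,
% so \exists z (N'z \wedge P_\phi(z, u)) holds of u iff P_\phi(d_m, u) holds for
% some m, i.e. iff \phi_m(u) holds for some m. Thus \sigma^* says what \sigma says.
```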


THEOREM 7.6. If a sentence $\sigma$ of $L_{\omega_1\omega}(S)$ has a model $\mathfrak{A}$ of type $(\kappa, \lambda)$ where
$\aleph_0 \le \lambda < \kappa$, then $\sigma$ has a model $\mathfrak{C}$ of type $(\aleph_1, \aleph_0)$.


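Proof. We can assume that the notion of type $(\kappa, \lambda)$ is with respect to a fixed
symbol N (i.e., $\lambda = \overline{\overline{N^{\mathfrak{A}}}}$). Let $\Sigma$ be as in 7.5. Then $\mathfrak{A}$ can be expanded to an $\omega$-model
$\mathfrak{A}'$ (relative to $N', d_0, \ldots$) of $\Sigma$. Let $\theta$ be $Nv_0 \vee N'v_0$. Obviously
$\overline{\overline{\theta^{\mathfrak{A}'}}} < \overline{\overline{A'}}$, so we
can apply 7.1 to $\mathfrak{A}'$ and $\theta$. We obtain $\mathfrak{B}'$ and $\mathfrak{C}'$ such that $\mathfrak{B}' \prec \mathfrak{A}'$, $\mathfrak{C}' \succ \mathfrak{B}'$,
$\theta^{\mathfrak{B}'} = \theta^{\mathfrak{C}'}$, $\overline{\overline{B'}} = \aleph_0$, and $\overline{\overline{C'}} = \aleph_1$. It is easily verified that $\mathfrak{C}'$ is an $\omega$-model of $\Sigma$.
Hence, by 7.5, $\mathfrak{C} = \mathfrak{C}' \restriction S$ is a model of $\sigma$. Clearly $\mathfrak{C}$ is of type $(\aleph_1, \aleph_0)$, as desired.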


A beautiful application of 7.6, due to Keisler [6], gives a new, short proof of
a part of Morley's Theorem 3.7 on theories categorical in power and also strengthens
that part of Morley's Theorem. Of course, a class $\mathcal{K}$ of S-structures is called
categorical in the power $\kappa$ if it has, up to isomorphism, exactly one member of power $\kappa$.


COROLLARY 7.7. If a $PC_\delta$-class $\mathcal{K}$ is categorical in power $\aleph_1$, then $\mathcal{K}$ is
categorical in every uncountable power.


Proof. We are given that $\mathcal{K} = (\mathrm{Mod}\,\Sigma) \restriction S$, where $\Sigma$ is a set of
$S_1$-sentences, $S_1 \supseteq S$. (The meaning of '$PC_\delta$' (rather than '$PC_\Delta$') is that $S_1$ is assumed
countable.) Morley's Theorem 3.8 applied only to $EC_\delta$-classes, so we are extending
the '$\aleph_1$-case' of 3.5 to $PC_\delta$-classes (which are much broader).

By (4) of §4, assuming the continuum hypothesis, there is a saturated model $\mathfrak{A}$ of $\Sigma$
of power $\aleph_1$. By (4) of §4, $\mathfrak{A} \restriction S$ is also saturated. Hence (*) there is a saturated
member of $\mathcal{K}$ of power $\aleph_1$. By using a result in Morley's work, (*) can be
obtained without the continuum hypothesis, but the proof we give of (*) (and hence
7.7) will have to depend on the continuum hypothesis.

Let $\kappa > \aleph_1$. There is a member of $\mathcal{K}$ of power $\kappa$ (by a trivial extension of 3.3
to $PC_\delta$ classes). Since $\mathcal{K}$ is categorical in the power $\aleph_1$, all members of $\mathcal{K}$ of power
$\kappa$ are elementarily equivalent (by an obvious extension of 3.4 to $PC_\delta$). Hence, if
all members of $\mathcal{K}$ of power $\kappa$ are saturated, then they are all isomorphic, by 4.3(a).
Now assume $\mathcal{K}$ is not categorical in power $\kappa$. Our argument shows that there is
a non-saturated $\mathfrak{A} \in \mathcal{K}$ of power $\kappa$. We plan to infer, using 7.6, that there is a
non-saturated member of $\mathcal{K}$ of power $\aleph_1$. Since by (*), $\mathcal{K}$ has also a saturated model
of power $\aleph_1$, this will be a contradiction.

Since $\mathfrak{A} \in \mathcal{K}$, $\mathfrak{A} = \mathfrak{B} \restriction S$ for some $\mathfrak{B} \in \mathrm{Mod}\,\Sigma$. As $\mathfrak{A}$ is not saturated, there is
a subset X of A of power $< \kappa$ and a set $\Phi$ of 1-formulas which is finitely satisfiable,
but not satisfiable in $(\mathfrak{A}, x)_{x \in X}$. For convenience we can of course take X to be
infinite. Write $c_x$ for the symbol denoting x in $(\mathfrak{A}, x)_{x \in X}$. We shall adjoin to $\mathfrak{B}$ countably
many new relations which express or code information about $\Phi$.

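Each S-formula $\phi$ is an n-formula for a smallest $n = n(\phi)$, and corresponding
to $\phi$, we introduce a new relation $R_\phi$ over X (in $\mathfrak{A}$) as follows:

$R_\phi(x_1, \ldots, x_{n-1})$ if and only if the formula $\phi(v_0, c_{x_1}, \ldots, c_{x_{n-1}}) \in \Phi$. Now consider
the structure $\mathfrak{B}^* = (\mathfrak{B}, X, R_\phi)_{\phi \in F}$ (where F is the set of all S-formulas $\phi$). Since
$\Phi$ is finitely satisfiable in $(\mathfrak{A}, x)_{x \in X}$, we have, for any $\phi_0, \ldots, \phi_k$ (writing
$m_i = n(\phi_i) - 1$)

(1) $\mathfrak{B}^* \models \bigl[\bigwedge_{i \le k} R_{\phi_i}(u^i_1, \ldots, u^i_{m_i}) \to \exists v_0 \bigwedge_{i \le k} \phi_i(v_0, u^i_1, \ldots, u^i_{m_i})\bigr]$

(where the u's are distinct variables).
Since the whole $\Phi$ is not satisfiable, we have:

(2) $\mathfrak{B}^* \models \sim\exists v_0 \bigwedge_{\phi \in F} (\forall v_1, \ldots, v_{n(\phi)-1}) \bigl(R_\phi(v_1, \ldots, v_{n(\phi)-1}) \to \phi(v_0, v_1, \ldots, v_{n(\phi)-1})\bigr)$.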


Since S is countable, F is countable, so the sentence in (2) is in $L_{\omega_1\omega}$. Let $\sigma$ be the
conjunction of (i) all the universalizations of the formulas in (1) (over all possible
$\phi_0, \ldots, \phi_k$), (ii) the sentence in (2), and (iii) all the sentences in $\mathrm{Th}(\mathfrak{B}^*)$. Again,
$\sigma$ is in $L_{\omega_1\omega}$.

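By 7.6, there is a model $\mathfrak{B}'^* = (\mathfrak{B}', X', R'_\phi)_{\phi \in F}$ of $\sigma$ such that $\overline{\overline{B'}} = \aleph_1$ and
$\overline{\overline{X'}} = \aleph_0$. Let $\mathfrak{A}' = \mathfrak{B}' \restriction S$. Clearly $\mathfrak{A}'$ is a member of $\mathcal{K}$ of power $\aleph_1$. Consider
$(\mathfrak{A}', x')_{x' \in X'}$, where, say, $d_{x'}$ denotes $x'$ ($x' \in X'$). Let $\Delta$ be the set of all
$\phi(v_0, d_{x_1}, \ldots, d_{x_m})$ such that $R'_\phi x_1 \ldots x_m$ (and $\phi \in F$, $m = n(\phi) - 1$, $x_i \in X'$). $\Delta$ is
finitely satisfiable in $(\mathfrak{A}', x')_{x' \in X'}$, because each of the sentences in (1) holds in $\mathfrak{B}'^*$.
On the other hand, since the sentence in (2) holds in $\mathfrak{B}'^*$, it is clear that $\Delta$
is not satisfiable in $(\mathfrak{A}', x')_{x' \in X'}$. Thus (noting that $\overline{\overline{X'}} < \overline{\overline{A'}}$) we have shown
that $\mathfrak{A}'$ is not saturated, and the proof is complete.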


There are more results and some interesting open problems concerning possible
further extension of Morley's theorem 3.8 to $PC_\delta$ or even $PC(L_{\omega_1\omega})$ classes; see [6].
Of course, a $PC(L_{\omega_1\omega})$ class is one of the form $(\mathrm{Mod}\,\sigma') \restriction S$ where $\sigma'$ is an $L_{\omega_1\omega}$
sentence of type $S' \supseteq S$. A good example of a class which is (easily seen to be)
$PC(L_{\omega_1\omega})$ is the class of all free groups (or free anything). Notice that 7.6 immediately
implies its own strengthening in which "is a model of $\sigma$" is replaced by "belongs
to $\mathcal{K}$", $\mathcal{K}$ being any $PC(L_{\omega_1\omega})$ class.

Let us return now to the simplest two-cardinal theorem (1). In 7.6 we have kept
fixed the special role of $\aleph_1$ in (1) while passing to a language richer than the
elementary. What if we keep the elementary language, but change $\aleph_1$? We shall give
no proofs but only say something about what is known. By using saturated models
plus an ingenious device, Chang proved (cf. [2] or [3]):


THEOREM 7.8 (Chang) (G.C.H.). If a set $\Sigma$ of S-sentences has a model of type
$(\aleph_1, \aleph_0)$ then $\Sigma$ has a model of type $(\kappa^+, \kappa)$, provided $\kappa$ is any regular (infinite)
cardinal.


The exceptional case when $\kappa$ is not regular remained in total darkness until
very recently, when Jensen answered it assuming Gödel's axiom of constructibility.
That axiom, usually written 'V = L', is an axiom much stronger than G.C.H.,
but known to be relatively consistent with the ordinary axioms of set theory.


THEOREM 7.9 (Jensen). Assuming V = L, 7.8 also applies when $\kappa$ is not regular.


The status of 7.9 assuming only G.C.H. remains open. Much more is known about
two-cardinal and related problems. See [3] for some of these results and references
to work of MacDowell and Specker, Fuhrken, Keisler, Morley, Silver, the author,
and others. A recent result is [13]. Very recently Jensen, assuming V = L, has
obtained results concerning pairs $(\kappa^{++}, \kappa)$, $(\kappa^{+++}, \kappa)$, etc.


There are several books on model theory. Bell and Slomson [2] is a good shorter
treatment. Chang and Keisler [3] is by far the most comprehensive treatment to
appear. (Incidentally, most references we have omitted can be found in [3].) A.
Robinson ([12] and others) is excellent for applications of model theory to algebra.
Of the many important topics which we have not dealt with, the one which is
closest to those which we have discussed is the application of the partition theorems
of Ramsey and Erdős-Rado to model theory, by Ehrenfeucht-Mostowski, Morley,
and many others. Naturally, a discussion of this topic can be found in [3].


Part of the author's work was supported by National Science Foundation Grant No.
NSF-GP-8746.


References 


1. J. Ax and S. Kochen, Diophantine problems over local fields I, II, III, Amer. J. Math., 87
(1965) pp. 605-630, 631-648; Ann. Math., vol. 83, pp. 437-456.

2. J. Bell and A. Slomson, Models and Ultraproducts: an Introduction, North Holland,
Amsterdam, 1969.

3. C. C. Chang and H. J. Keisler, Model Theory (to appear). 

4. B. Jonsson, Homogeneous universal relational systems, Math. Scand., 8 (1960) 137-142. 

5. C. Karp, Languages with expressions of infinite length, North Holland, Amsterdam, 1964. 

6. H. J. Keisler, Some model-theoretic results for $\omega$-logic, Israel J. Math., 4 (1965) 249-261.

7. H. J. Keisler, Logic with the quantifier "there exist uncountably many", Ann. of Math. Logic,
1 (1970) 1-94.

8. E. G. K. Lopez-Escobar, On defining well-orderings, Fund. Math., 57 (1965) 253-272. 

9. M. Morley, Categoricity in power, Trans. Amer. Math. Soc., 114 (1965) 514-538. 

10. M. Morley, Countable models of $\aleph_1$-categorical theories, Israel J. Math., 5 (1967) 65-72.

11. M. Morley and R. Vaught, Homogeneous universal models, Math. Scand., 11 (1962) 37-57. 

12. A. Robinson, Complete Theories, North Holland, Amsterdam, 1956. 

13. J. Schmerl and S. Shelah, On models with orderings, Notices, Amer. Math. Soc., 17 (1970)
294.

14. A. Tarski, Some notions and methods on the borderline of algebra and metamathematics,
Proc. Int. Cong. Math., Providence, 1950, 1952, pp. 705-720.

15. A. Tarski, A Decision Method for Algebra and Geometry, 2nd ed., University of California
Press, Berkeley and Los Angeles, 1951.

16. B. L. van der Waerden, Modern Algebra, vol. 1, Ungar, New York, 1964. 

17. R. Vaught, Denumerable models of complete theories. Infinitistic methods, Proc. Symposium 
in Foundations Math., Warsaw 1959, New York, 1961, pp. 303-321. 

18. J. Baldwin and A. Lachlan, On strongly minimal sets, J. Symb. Logic, 36 (1971) 79-96. 


WHAT IS NONSTANDARD ANALYSIS? 
W. A. J. LUXEMBURG, California Institute of Technology 


1. Introduction. The subject named in the title may seem at first sight to be
far removed from the general topic of the Symposium, "The Foundations of
Mathematics". This relatively new field, which
was created by Abraham Robinson (see [7]), may be looked upon, however, as a
major contribution to the foundations of analysis. Furthermore, it is another splendid
example of an application of mathematical logic.

The development of mathematical analysis by using infinitely small and infinitely 
large numbers has been a subject of constant interest and controversy in the history 
of mathematics. Going back in history we discover that Leibniz was one of the 
strongest advocates of a method involving infinitely small and infinitely large numbers 
in the early stages of the development of the calculus. The reason why the theory 
of infinitesimals gradually fell into disrepute and was replaced later by the
$\varepsilon, \delta$-method must be sought in the fact that neither Leibniz nor his successors were able
to state with sufficient precision just what rules were supposed to govern their system 
of infinitely large and infinitely small numbers. Although Leibniz stated the principle 
that what holds for the finite numbers should also hold for the numbers in the ex- 
tended system, which includes the infinitely small and infinitely large numbers, it 
is not at all clear in his writings what sort of laws about numbers his principle was 
supposed to apply to. 

It was Abraham Robinson’s recent discovery, mentioned above, that the notions 
of model theory can clarify the notions of infinitely small and infinitely large. Robin- 
son shows that mathematical analysis can be developed by imbedding the real 
number system R in a proper extension *R of R which possesses in a certain sense 
the same properties as R. It is well known that such an extension *R must be non- 
Archimedean and this is the fact that enabled Robinson to define in *R the infinitely 
small and infinitely large numbers whose existence was taken for granted by Leibniz 
and his followers. From the well-known result that there exist systems of axioms
for the real number system which are categorical, that is, determine the real number
system uniquely up to isomorphism, it may seem at first very paradoxical that
such systems *R exist. This sort of paradox has been one of the main sources of the
condemnation of the theory of infinitesimals and infinitely large numbers as a tool
in analysis. The paradox vanishes completely, however, if we follow Robinson's
idea to restrict the statement "the same properties" to a specified collection of
properties of R which can be formulated in a specified formal language with the
appropriate interpretation in R as well as in *R, and in which the classical isomorphism
theorem for the real number system cannot be formulated. Of course it is at this
point that model theory comes into play, which by means of the compactness
principle guarantees the existence of such systems *R.

There is, however, another way to establish the existence of *R. This method 
is known as the construction of models in the form of ultraproducts. It has the 
advantage that it can be developed within the framework of axiomatic set theory. 
We shall follow this procedure here. Sections 2, 3 and 4 are entirely devoted to a
discussion of the existence of *R. In our approach we follow very closely the
development given by Abraham Robinson and Elias Zakon in their paper
"A set-theoretical characterization of enlargements", which appeared in [6].
In the remaining six sections it is illustrated by means of examples in which sense 
the theory of infinitely small and infinitely large numbers can be used as a tool in 
analysis. The topics which were selected for this purpose include the theory of limits, 
Euler’s product formula for the sine, and the existence of functions which are not 
measurable in the sense of Lebesgue. 

The ideas of nonstandard analysis were subsequently successfully applied to
other branches of mathematics. These developments are not taken up here as they
are beyond the scope of the present introductory paper. We would, however, like to refer the
interested reader, who for instance would like to know with what great success
this method was used by A. Robinson and A. Bernstein to solve the invariant
subspace problem for a certain class of bounded operators on a Hilbert space, to
Robinson's book [8] and the papers [1], [2] and [15]. Furthermore, we would like
to draw the reader's attention to reference [6], which is the Proceedings of the
International Symposium on Nonstandard Analysis held at the California
Institute of Technology in 1967. Its contents, consisting of more than twenty
papers, give the latest developments in this field.

Finally, the author would like to state that the present paper is mainly expository
in nature. It is particularly directed to those mathematicians who would like to get
acquainted with this new tool in analysis. We do hope, however, that the
specialists in the field will also find something new and of interest in this paper.


2. Definition of the structure R and some of its properties. The earlier version
of nonstandard analysis (see [7] and [3]) rests on those properties
of R which can be formulated in a first order language, which means briefly that
quantification in the formal language is permitted only on variables ranging over
real numbers. One need not go far in analysis, however, to realize the need for a
richer language in which statements containing expressions such as, for example,
"For all nonempty sets of natural numbers..." or "There exists a continuous
function..." can be formulated. In this connection it is also good to observe that
even some of the axioms of the real number system are outside the language of the
lower predicate calculus. For example, Dedekind's completion axiom involving
quantification with respect to ordered pairs of sets (Dedekind cuts) is such an axiom.
In order to cope with this difficulty we shall use the framework of axiomatic set
theory in terms of which the theory of real numbers can be developed. The formal
language will be a lower order language whose constants will range over sets and 
numbers. We shall now present this development here in some detail. We shall 
assume that the reader is familiar with the elements of naive set theory and with 
some of the definitions and results concerning the lower predicate calculus. 

Let R denote as usual the set of real numbers. Then we define inductively the sets
$R_0 = R$ and $R_{n+1} = P(\bigcup_{k=0}^{n} R_k)$ $(n = 0,1,2,\ldots)$, where $P(X)$ denotes the set of
all subsets of $X$. The union of all the sets $R_n$, $\bigcup_{n \ge 0} R_n$, is called the superstructure
on R and will be denoted by R. The elements of R are called the entities of the super-
structure R. The elements of $R_0 = R$, that is the real numbers, on which the super-
structure is based are sometimes also referred to as the individuals of R.

We shall assume that an ordered pair (a, b) is defined in the sense of Kuratowski
by $(a,b) = \{\{a\}, \{a,b\}\}$ and that n-tuples $(a_1,\ldots,a_n)$ are defined inductively by
$(a_1) = a_1$, $(a_1,\ldots,a_n) = ((a_1,\ldots,a_{n-1}), a_n)$. Then it follows immediately that relations
defined as sets of n-tuples $(n = 1,2,\ldots)$ are all entities of R. Since the algebraic
operations of R can be defined in terms of three place relations as follows: $ab = c$
if and only if $(a,b,c) \in P \in R$ and $a + b = c$ if and only if $(a,b,c) \in S \in R$, and the
order relation is a binary relation, it follows that the axioms and the properties of
R can be expressed in terms of certain entities of R. The remaining part of this sec-
tion will now be devoted to making this more precise.

The entities of $R_n - R_{n-1}$ $(n \ge 1)$ are called of rank n in R. The individuals
are given the rank 0. The reader should observe that by means of this definition,
the empty set gets assigned rank 1. If $a \in R$ is not empty, then the rank of a is the
smallest natural number n such that $a \in R_n$. It is also easy to see that if $a_1,\ldots,a_n \in R$,
then rank $(a_1,\ldots,a_n) \le \max(\text{rank } a_1,\ldots,\text{rank } a_n) + 2n$.
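The rank computation can be tried out on small hereditarily finite sets. The following is a minimal sketch, assuming that individuals are modelled by ordinary Python numbers and sets by `frozenset`; the helper names `rank`, `pair` and `tuple_of` are ours and do not occur in the paper. It suggests that a Kuratowski pair raises the larger of the two ranks by exactly 2, which is consistent with the bound stated above.

```python
def rank(x):
    """Rank in the superstructure: individuals have rank 0, and a set has rank
    1 + (largest rank among its elements), so the empty set has rank 1."""
    if not isinstance(x, frozenset):
        return 0  # assumption: anything that is not a frozenset is an individual
    return 1 + max((rank(e) for e in x), default=0)


def pair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def tuple_of(*items):
    """n-tuple defined inductively: (a1) = a1, (a1,...,an) = ((a1,...,a_{n-1}), an)."""
    result = items[0]
    for item in items[1:]:
        result = pair(result, item)
    return result


empty_rank = rank(frozenset())            # the empty set
pair_rank = rank(pair(1.0, 2.0))          # a pair of individuals
triple_rank = rank(tuple_of(1.0, 2.0, 3.0))
```

For individuals the pair has rank 2 and the triple rank 4, both within the bound max + 2n.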

Some minor set-theoretical properties of R are collected, for later references, 
in the following lemma. 


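LEMMA 2.1. (i) $R_p \subset R_n$ for all $n \ge p \ge 1$.

(ii) $\bigcup_{k=0}^{n} R_k = R_0 \cup R_n$ for all $n \ge 1$.

(iii) $R_k \in R_{n+1}$ for all $0 \le k \le n$ and for all $n \ge 0$.

(iv) If $x \in y \in R_n$ $(n \ge 1)$, then $x \in R_0 \cup R_{n-1}$.

(v) If $(x_1,\ldots,x_n) \in y \in R_p$ $(p \ge 1)$, then $x_1,\ldots,x_n \in R_0 \cup R_{p-1}$. In particular,
if an entity $\Phi \in R$ is a binary relation, then its domain, dom $\Phi = \{x : (\exists y)(x,y) \in \Phi\}$
$\in R$ and its range, ran $\Phi = \{y : (\exists x)(x,y) \in \Phi\} \in R$.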


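Proof. (i) If $x \in R_p$, then $x \subset \bigcup_{k=0}^{p-1} R_k$, and so $x \subset \bigcup_{k=0}^{q} R_k$ for all $q \ge p - 1$.
Hence, $x \in P(\bigcup_{k=0}^{q} R_k) = R_{q+1}$ for all $q + 1 \ge p$.

(ii) For $n \ge 1$, $R_n \subset R_{n+1}$, and so since $R_0$ is disjoint from all $R_n$ $(n \ge 1)$ it
follows that for all $n \ge 1$ we have $\bigcup_{k=0}^{n} R_k = R_0 \cup R_n$.

(iii) Since by (ii) we have that $R_k \subset R_0 \cup R_n$ $(0 \le k \le n)$ we obtain that
$R_k \in P(R_0 \cup R_n) = R_{n+1}$.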



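(iv) If $y \in R_n$ $(n \ge 1)$, then $y \subset R_0 \cup R_{n-1}$, and so $x \in y$ implies that
$x \in R_0 \cup R_{n-1}$.

(v) If $(x_1,\ldots,x_n) \in y \in R_p$ $(p \ge 1)$, then $(x_1,\ldots,x_n) \in R_0 \cup R_{p-1}$. Hence,
$\{\{x_1\}, \{x_1,(x_2,\ldots,x_n)\}\} \in R_0 \cup R_{p-1} = R_0 \cup P(R_0 \cup R_{p-2})$ implies $x_1 \in R_0 \cup R_{p-3}$
$\subset R_0 \cup R_{p-1}$, and similarly for the entities $x_2,\ldots,x_n$.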
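Parts (i) to (iv) of the lemma can be checked by brute force on a toy superstructure. The sketch below is only an illustration under a strong assumption: the set of individuals is replaced by a single hypothetical atom `"r"`, so that the first few levels are finite and small. The function names are ours.

```python
from itertools import chain, combinations


def powerset(s):
    """P(s): the set of all subsets of a finite set s, as frozensets."""
    items = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))}


def levels(atoms, depth):
    """Return [R_0, ..., R_depth] with R_0 = atoms and R_{n+1} = P(R_0 ∪ ... ∪ R_n)."""
    R = [set(atoms)]
    for _ in range(depth):
        R.append(powerset(set().union(*R)))
    return R


def lemma_2_1_holds(R):
    """Check parts (i)-(iv) of Lemma 2.1 on the finitely many levels supplied."""
    top = len(R) - 1
    part_i = all(R[p] <= R[n] for n in range(1, top + 1) for p in range(1, n + 1))
    part_ii = all(set().union(*R[:n + 1]) == R[0] | R[n] for n in range(1, top + 1))
    part_iii = all(frozenset(R[k]) in R[n + 1]
                   for n in range(top) for k in range(n + 1))
    part_iv = all(x in R[0] | R[n - 1]
                  for n in range(1, top + 1) for y in R[n] for x in y)
    return part_i and part_ii and part_iii and part_iv


R = levels({"r"}, 3)   # a single hypothetical individual "r" standing in for the reals
sizes = [len(level) for level in R]
```

With one atom the levels have 1, 2, 8 and 512 elements, which already shows how quickly the superstructure grows.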

The formal language will now be introduced. 

The atomic symbols of L are: (i) The connectives $\wedge$, $\vee$, $\Rightarrow$, $\Leftrightarrow$, $\neg$, for “and”,
“or”, “implies”, “if and only if”, “not” respectively. (ii) The variables, a countably
infinite sequence usually denoted by $x, y, \ldots$ with or without subscripts. (iii) The
quantifiers $(\exists\,\cdot)$-existential, and $(\forall\,\cdot)$-universal. (iv) Brackets [ ], used for grouping
formulas as usual in mathematics. (v) The basic predicate $\in$, read “member of”,
with one open place to the left and to the right of it. (vi) Extra logical constants 
(briefly, constants). This is a set of symbols of which there are enough to be put 
in one-to-one correspondence with the entities of whatever structure may be under 
consideration. This set of constants is usually infinite but fixed. Furthermore, con- 
stants are usually denoted by Roman letters with or without subscripts from the 
beginning of the alphabet, and other symbols such as the numerals $0, 1, 2, \ldots$.

We shall now assume that the set of constants of L is brought in one-to-one
correspondence with all the entities of the structure R and we shall from now on
identify the constants of L with the entities of R so that R is part of L. If such an
identification has been established, then we refer to R as an L-structure.

The interpretation of the basic predicate $\in$ of L in R will be the membership
relation of axiomatic set theory.

From the atomic formulas $\alpha \in \beta$, where the symbols $\alpha$ and $\beta$ may denote con-
stants and variables, the well-formed formulas (wff) are obtained in successive stages
by applying the connectives and quantifiers. At the same time brackets are introduced
in such a way that the formation of the formula can be unambiguously determined.
More precisely, if V is an atomic formula, then [V] is a wff; if V, W are wff, then
$[V \wedge W]$, $[V \vee W]$, $[\neg V]$, $[V \Rightarrow W]$, $[V \Leftrightarrow W]$ are wff; and if V is a wff, then $[(\forall x)V]$
and $[(\exists x)V]$ are wff, where x denotes an arbitrary variable, provided x does not
already appear in V under the sign of a quantifier. Furthermore, we shall adhere to
the terminology that in $[(\forall x)V]$ and $[(\exists x)V]$, V is called the scope of the quantifier,
and likewise in all the wff which can be obtained from these by the further repeated
applications of connectives and quantifiers. A variable x is called free in a wff V if x is
not in $(\exists x)$ or $(\forall x)$ or in the scope of a quantifier in V. A wff is called a sentence if
every variable is in the scope of a quantifier, otherwise it is called a predicate. A wff
V in L is said to be in prenex normal form, if in the formation of V from atomic
formulas the quantifiers are applied after the connectives, that is, if the connectives
are in the scope of all quantifiers. In symbols, $V = (qx_n)\cdots(qx_1)W$, where $(q\,\cdot)$ de-
notes either $(\exists\,\cdot)$ or $(\forall\,\cdot)$ and where W is a wff without quantifiers, is a wff in prenex
normal form. One of the basic results of the lower predicate calculus states that
every wff is equivalent to a wff which is in prenex normal form (see [8], p. 10).


For our purpose we shall only consider those wff of L which have the property
that all quantifiers are of the form “$(\forall x)[[x \in A] \Rightarrow \cdots]$” and “$(\exists x)[[x \in A] \wedge \cdots]$”,
where A is an entity of R, and which are called the admissible wff. Thus a wff is ad-
missible whenever the domain of every quantifier occurring in it is a specific entity
of R. The set of admissible wff of L will be denoted by $K = K(L)$ and the subset of
K of all admissible sentences which hold in R will be denoted by $K_0 = K_0(L)$.

At this point the reader would do well to observe that all statements in analysis
dealing with numbers, sets of numbers, relations between numbers, relations between
sets and numbers, and so on, and which hold in R can be expressed as admissible
sentences of L which are in $K_0$. For instance, the sentence of $K_0$


$(\forall a)(\forall b)(\forall c)\,[a, b, c \in R] \Rightarrow [P(a,b,c) \Rightarrow P(b,a,c)]$


expresses that R is commutative (P is the constant denoting the three place relation 
of multiplication). 
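To see how a sentence with bounded quantifiers is evaluated, one may replace R by a finite stand-in. The following sketch is an illustration only: the domain (the integers modulo 5) and the helper names are our assumptions, and the relation P is built as a set of triples exactly as the three place relation of multiplication is described above.

```python
from itertools import product


def forall(domain, predicate):
    """Bounded universal quantifier: (∀x)[[x ∈ domain] ⇒ predicate(x)]."""
    return all(predicate(x) for x in domain)


def exists(domain, predicate):
    """Bounded existential quantifier: (∃x)[[x ∈ domain] ∧ predicate(x)]."""
    return any(predicate(x) for x in domain)


def relation_of(domain, operation):
    """The three place relation {(a, b, c) : operation(a, b) = c} on the domain."""
    return {(a, b, operation(a, b)) for a, b in product(domain, repeat=2)}


def commutative(domain, P):
    """Evaluate (∀a)(∀b)(∀c)[a, b, c ∈ domain] ⇒ [P(a,b,c) ⇒ P(b,a,c)]."""
    return forall(domain, lambda a: forall(domain, lambda b: forall(
        domain, lambda c: ((a, b, c) not in P) or ((b, a, c) in P))))


D = range(5)                                       # hypothetical finite domain
P = relation_of(D, lambda a, b: (a * b) % 5)       # multiplication modulo 5
Q = relation_of(D, lambda a, b: a)                 # left projection, not commutative
```

Here `commutative(D, P)` holds while `commutative(D, Q)` fails, since every quantifier ranges over a specific set, as admissibility requires.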

Any *L-structure *(R) in which the L-structure R can be properly imbedded 
and for which all admissible sentences of R which hold in R with appropriate inter- 
pretation of the symbols in *(R) also hold in *(R) will be called a higher order 
nonstandard model of R. In that case, it turns out that the set *R of individuals 
of *(R) is a totally ordered field of which R is a proper subfield. But *(R) is not the 
superstructure determined by *R. In fact, if A = P(R) is the constant which denotes 
the entity of R of all subsets of R, then under the imbedding of R in *(R) this constant 
will not denote the set of all subsets of *R as might be expected at first, but only 
a subsystem of the power set of *R, and so on. How this all will come about will 
be explained in detail in the next section. 


3. Models of R that are ultrapowers. We begin by recalling some definitions 
and elementary results from the theory of filters. 

Let I denote a nonempty set. By a filter over I we mean a nonempty set $\mathcal{F}$ of
subsets of I such that the empty set $\emptyset \notin \mathcal{F}$, $\mathcal{F}$ is closed under finite intersections,
and $F \subset G$ and $F \in \mathcal{F}$ implies $G \in \mathcal{F}$. In particular $\mathcal{F} \ne \emptyset$ implies that $I \in \mathcal{F}$. A
filter $\mathcal{F}_1$ is called finer than a filter $\mathcal{F}_2$ ($\mathcal{F}_2 \le \mathcal{F}_1$) whenever $F \in \mathcal{F}_2$ implies $F \in \mathcal{F}_1$.
This relation orders the set of all filters over I and the filter $\{I\}$ is its smallest element.
A filter $\mathcal{F}$ is called an ultrafilter whenever it is not properly contained in any other
filter, that is, the ultrafilters are the maximal elements of the ordered set of filters.
Concerning ultrafilters we have the following important characterization. A filter $\mathcal{F}$
is an ultrafilter if and only if for every $F \subset I$ either $F \in \mathcal{F}$ or $I - F \in \mathcal{F}$. The latter
statement is easily seen to be equivalent to: If

$\bigcup_{i=1}^{n} F_i \in \mathcal{F}$ $(F_i \subset I,\ i = 1,2,\ldots,n)$,

then $F_i \in \mathcal{F}$ for at least one index i, and so, is itself a characterization of the concept
of an ultrafilter.
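On a finite set the filter axioms and the characterization of ultrafilters can be verified exhaustively. The sketch below assumes a three-element index set and uses helper names of our own choosing. On a finite set every filter is principal, so this only illustrates the definitions; the free ultrafilters needed later live on infinite sets.

```python
from itertools import chain, combinations


def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def is_filter(family, universe):
    """Check the filter axioms for a family of subsets of `universe`."""
    family = set(family)
    if not family or frozenset() in family:
        return False
    closed_under_meets = all(F & G in family for F in family for G in family)
    upward_closed = all(G in family
                        for F in family for G in subsets(universe) if F <= G)
    return closed_under_meets and upward_closed


def all_filters(universe):
    """Brute-force list of every filter over a small finite set."""
    subs = subsets(universe)
    families = chain.from_iterable(
        combinations(subs, k) for k in range(1, len(subs) + 1))
    return [frozenset(f) for f in families if is_filter(f, universe)]


def is_maximal(filt, filters):
    """Ultrafilter in the sense of the text: not properly contained in another filter."""
    return not any(filt < other for other in filters)


def decides_every_subset(filt, universe):
    """The characterization: for every F, either F or its complement is in the filter."""
    whole = frozenset(universe)
    return all(F in filt or whole - F in filt for F in subsets(universe))


I = {0, 1, 2}
filters = all_filters(I)
ultrafilters = [f for f in filters if is_maximal(f, filters)]
```

There are seven filters over this set, one for each nonempty generating subset, and the three maximal ones are exactly those that decide every subset.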



A filter $\mathcal{F}$ is called δ-incomplete, whenever there exists a sequence $F_n \in \mathcal{F}$
$(n = 1,2,\ldots)$ such that $\bigcap_{n=1}^{\infty} F_n \notin \mathcal{F}$, and a filter $\mathcal{F}$ is called δ-complete whenever
it is not δ-incomplete. A filter $\mathcal{F}$ is called free whenever $\bigcap (F : F \in \mathcal{F}) = \emptyset$. It is
not known whether δ-complete free ultrafilters exist. This problem is known as
Ulam’s measure problem. It is easy to see, however, that a δ-incomplete ultrafilter
is free. It follows from the following simple result, the proof of which we leave to
the reader as an exercise.


An ultrafilter $\mathcal{U}$ is δ-incomplete if and only if there exists a countable partition
$\{I_n : n = 1,2,\ldots\}$ of the set I over which $\mathcal{U}$ is defined such that $I_n \notin \mathcal{U}$ for all $n = 1,2,\ldots$.


From this result in conjunction with Zorn’s lemma it follows now also easily
that on every infinite set there exist plenty of δ-incomplete ultrafilters. For further
information on filters we refer the reader to the paper of the author, “A general
theory of monads,” which appeared in [6].

We shall now turn to a description of a structure which is an ultrapower of R. 


Let I be an infinite set, let $\mathcal{U}$ be a δ-incomplete ultrafilter of subsets of I and let
$\{I_n : n = 1,2,\ldots\}$ be a countable partition of I satisfying $I_n \notin \mathcal{U}$ for all $n = 1,2,\ldots$
which will be kept fixed.


By $R^I$ we denote as usual the set of all mappings of I into R. There exists a
natural imbedding $a \to {}^*a$ of R into $R^I$ defined by ${}^*a(i) = a$ for all $i \in I$, that is
R is identified in $R^I$ by the constant mappings. The undefined basic predicates
“=” and “$\in$” of R can be extended to $R^I$ by means of the following $\mathcal{U}$-dependent
definitions.


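DEFINITION 3.1. If $a, b \in R^I$, then $a =_{\mathcal{U}} b$ if and only if $\{i : a(i) = b(i)\} \in \mathcal{U}$,
and $a \in_{\mathcal{U}} b$ if and only if $\{i : a(i) \in b(i)\} \in \mathcal{U}$.

Since it is an immediate consequence of $I \in \mathcal{U}$ that if $a, b \in R$, then $a = b$ if
and only if ${}^*a =_{\mathcal{U}} {}^*b$, and $a \in b$ if and only if ${}^*a \in_{\mathcal{U}} {}^*b$, it follows that the relations
“$=_{\mathcal{U}}$” and “$\in_{\mathcal{U}}$” are $\mathcal{U}$-extensions of “=” and “$\in$” of R. For the sake of simplicity
we shall from now on retain the original notation “=” for “$=_{\mathcal{U}}$” and “$\in$” for
“$\in_{\mathcal{U}}$”.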

In order to justify the definition we are going to show that for all $a, b \in R^I$
either $a = b$ or not $(a = b)$ $(a \ne b)$ holds, and $a \in b$ or not $(a \in b)$ $(a \notin b)$ holds.
Since the proof for both cases is the same we shall only verify it for “=”. If
$a, b \in R^I$, then we set


$U_1 = \{i : a(i) = b(i)\}$ and $U_2 = \{i : a(i) \ne b(i)\}$.


From $U_1 \cup U_2 = I \in \mathcal{U}$ it follows from the basic property of an ultrafilter
that either $U_1 \in \mathcal{U}$ and $U_2 \notin \mathcal{U}$ or $U_1 \notin \mathcal{U}$ and $U_2 \in \mathcal{U}$, that is, by Definition 3.1,
either $a = b$ or not $(a = b)$ holds.
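The mechanics of Definition 3.1 can be tried out on a finite index set. This is a sketch under a deliberately unrealistic assumption: on a finite set every ultrafilter is principal, so the one used here (all subsets containing a chosen point) is neither free nor δ-incomplete, and the resulting structure contains nothing beyond the standard entities. The names `principal_ultrafilter`, `equal_mod`, `member_mod` and `star` are ours.

```python
def principal_ultrafilter(index_set, point):
    """Return a membership test for the ultrafilter {F ⊂ I : point ∈ F}."""
    index_set = frozenset(index_set)

    def contains(subset):
        subset = frozenset(subset)
        return subset <= index_set and point in subset

    return contains


def equal_mod(a, b, index_set, in_filter):
    """a = b in the sense of Definition 3.1: {i : a(i) = b(i)} lies in the filter."""
    return in_filter({i for i in index_set if a[i] == b[i]})


def member_mod(a, b, index_set, in_filter):
    """a ∈ b in the sense of Definition 3.1: {i : a(i) ∈ b(i)} lies in the filter."""
    return in_filter({i for i in index_set
                      if isinstance(b[i], frozenset) and a[i] in b[i]})


def star(value, index_set):
    """The constant mapping *value, i.e. the image of value under a -> *a."""
    return {i: value for i in index_set}


I = range(4)
in_U = principal_ultrafilter(I, 2)       # hypothetical choice of ultrafilter
a = {0: 1.0, 1: 1.0, 2: 5.0, 3: 1.0}     # a mapping of I into the reals

U1 = {i for i in I if a[i] == 5.0}       # where a agrees with *5
U2 = set(I) - U1                         # where it does not
```

Exactly one of `U1`, `U2` belongs to the ultrafilter, so `a` either equals `*5` or it does not. Here it does, although `a` differs from the constant mapping at three of the four indices.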



Having justified the definition we can justify further the suggestion that the
relations “=”, “$\in$” in $R^I$ behave like equality and membership of set theory. Since
the individuals of R are without members but different from $\emptyset$, that is, set theory
in R is based on a set of so-called urelements, equality of sets in terms of $\in$ should
read “$a = b$” if and only if $a \in c$ implies $b \in c$ for all $c \in R^I$. But this can now be im-
mediately verified by observing that if $a = b$ and $a \in c$, then


$U_1 = \{i : a(i) = b(i)\} \in \mathcal{U}$ and $U_2 = \{i : a(i) \in c(i)\} \in \mathcal{U}$


implies by the filter properties that $U_1 \cap U_2 \in \mathcal{U}$, and so $i \in U_1 \cap U_2$ implies that
$b(i) \in c(i)$, that is, $b \in c$. Conversely, we have that if $a, b \in R^I$, then $a \in \{x : x = a$
and $x \in R^I\} = \{a\}$ implies that $b \in \{a\}$, that is $b = a$. That the relation of equality,
as defined in Definition 3.1, is an equivalence relation is immediately clear. That
it satisfies the rule of substitution in $\in$, namely,


$(\forall a)(\forall b)(\forall c)(\forall d)[[a \in b] \wedge [a = c] \wedge [b = d]] \Rightarrow [c \in d]$


can be verified in the same way by using the properties of $\mathcal{U}$.

Continuing this process we can show, by using the basic properties of $\mathcal{U}$, that
one by one the statements which hold in R hold in $R^I$ under the defined interpre-
tation of the basic predicates. We shall of course not follow this procedure but show
in a general fashion that a certain substructure of $R^I$ has the same properties as R.

For this purpose we shall assume that the elements of $R^I$ are identified in a
one-to-one manner with the constants of a formal language *L. Furthermore, *L
is assumed to have two basic predicates “=” (equality) and “$\in$” (membership)
which are identified with the corresponding relations of $R^I$. Thus we obtain an
*L-structure $R^I$ whose set of true sentences depends on $\mathcal{U}$. A certain substructure
of our *L-structure will be singled out which we shall show to satisfy, in a certain
sense, the sentences of $K_0$.

In the following lemma, however, we shall first list for later reference, some of
the basic properties of the imbedding $a \to {}^*a$ of R into $R^I$.


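LEMMA 3.2. (i) ${}^*\emptyset = \emptyset$.

(ii) If $a, b \in R$, then $a \subset b$ implies ${}^*a \subset {}^*b$.

(iii) If $a, b \in R$, then $a \in b$ if and only if ${}^*a \in {}^*b$.

(iv) For all $a \in R$ we have ${}^*\{a\} = \{{}^*a\}$.

(v) If $a_1,\ldots,a_n \in R$, then ${}^*(\bigcup_{i=1}^{n} a_i) = \bigcup_{i=1}^{n} {}^*a_i$, ${}^*(\bigcap_{i=1}^{n} a_i) = \bigcap_{i=1}^{n} {}^*a_i$,
${}^*\{a_1,\ldots,a_n\} = \{{}^*a_1,\ldots,{}^*a_n\}$, ${}^*(a_1,\ldots,a_n) = ({}^*a_1,\ldots,{}^*a_n)$, and ${}^*(a_1 \times \cdots \times a_n)$
$= {}^*a_1 \times \cdots \times {}^*a_n$.

(vi) For all $a, b \in R$ we have ${}^*(a - b) = {}^*a - {}^*b$.

(vii) If $b \in R$ is a binary relation, then ${}^*(\text{dom}\, b) = \text{dom}\, {}^*b$, ${}^*(\text{ran}\, b) = \text{ran}\, {}^*b$,
and for all $a \in R$ we have


${}^*(b(a)) = {}^*\{y : (\exists x)(x \in a \wedge (x,y) \in b)\} = {}^*b({}^*a) = \{y : (\exists x)(x \in {}^*a \wedge (x,y) \in {}^*b)\}$.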



Proof. We shall only prove (vi) since the proofs of the other statements are similar.
For these proofs we refer the reader to the proofs of Theorems 7.1 and 7.7 of [3].

(vi) If $c \in {}^*(a - b)$, then $U_1 = \{i : c(i) \in a - b\} \in \mathcal{U}$ implies, using $U_1 \subset U_2$
$= \{i : c(i) \in a\} \in \mathcal{U}$, that $c \in {}^*a$, and, using $U_1 \subset U_3 = \{i : c(i) \notin b\} \in \mathcal{U}$ which, since
$\mathcal{U}$ is an ultrafilter, is equivalent to $\{i : c(i) \in b\} \notin \mathcal{U}$, that $c \notin {}^*b$, and so $c \in {}^*a - {}^*b$.
For the converse reverse the steps.


DEFINITION 3.3. An entity a of the *L-structure $R^I$ is called internal when-
ever there exists a natural number $n \ge 0$ such that $a \in {}^*R_n$. An internal entity a
is called a standard entity whenever there exists an entity $b \in R$ such that $a = {}^*b$.
All entities which are not internal are called external.

The set $\bigcup_{n \ge 0} {}^*R_n$ of all internal entities is called the ultrapower of R with
respect to the ultrafilter $\mathcal{U}$ and will be denoted by *(R).


The $\mathcal{U}$-ultrapower of R is usually denoted by $\mathcal{U}$-prod R but we shall not employ
this notation in this paper.

Observe that the mapping $a \to {}^*a$ of R into $R^I$ imbeds R into the substructure
*(R) of $R^I$.

The notion of rank extends immediately to the internal entities. An internal
entity $a \in {}^*(R)$ is said to be of rank n $(n \ge 1)$ whenever $a \in {}^*R_n - {}^*R_{n-1}$; and the
entities of ${}^*R = {}^*R_0$ are said to be of rank 0. The entities of rank 0 are also referred
to as the individuals of *(R). Again, by means of this definition the empty set ${}^*\emptyset$
has rank 1. The rank of an internal entity can be further specified. If a is non-
empty and internal, then $a \in {}^*R_p$ for some $p \ge 0$, and so, by Definition 3.1, we
have that $U = \{i : a(i) \in R_p\} \in \mathcal{U}$. Then $\bigcup_{k=0}^{p} \{i : \text{rank } a(i) = k\} = U \in \mathcal{U}$ implies,
using the fact that $\mathcal{U}$ is an ultrafilter, that there exists exactly one index n such that
$0 \le n \le p$ and $U_n = \{i : \text{rank } a(i) = n\} \in \mathcal{U}$. Then for all $i \in U_n \in \mathcal{U}$ we have
$a(i) \in R_n - R_{n-1}$, and so $a \in {}^*(R_n - R_{n-1}) = {}^*R_n - {}^*R_{n-1}$ (Lemma 3.2(vi)), that is,
rank $a = n$.

If $a = {}^*b$, $b \in R$, is a standard entity of *(R), then its rank remains unchanged.

At this point it seems natural to ask the question whether there are internal 
entities which are not standard. Fortunately, the answer to this question is affirmative 
and as we shall see in the following theorem it is a consequence of the hypothesis 
that the ultrafilter $\mathcal{U}$ is δ-incomplete, a hypothesis which we have not used so far.


THEOREM 3.5. There exist internal entities which are not standard. In fact,
if $a \in R$ is an entity which has infinitely many elements, then there exists an entity
$b \in {}^*a$ such that b is not standard.


Proof. Since a is an infinite set there exists a sequence $\{b_n : n = 1,2,\ldots\}$ of
elements of a such that $b_n \ne b_m$ for all $n, m = 1,2,\ldots$ and $n \ne m$. Let b be the mapping
of I into a such that $b(i) = b_n$ for all $i \in I_n$ $(n = 1,2,\ldots)$. Then $b \in {}^*a$ but b is not
equal to any standard element of *(R), and the proof is complete.
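The reason why b differs from every standard element can be made visible on a finite window of indices. The sketch below assumes, purely for illustration, that I is the set of positive integers, that the fixed partition is given by the exact power of 2 dividing i, that a = N and that $b_n = n$. None of these choices comes from the paper, and whether $I_n \notin \mathcal{U}$ holds is a hypothesis on the ultrafilter that code cannot check.

```python
def block_index(i):
    """Index n of the block I_n containing i, for the partition of the positive
    integers in which I_n consists of the i divisible by 2**(n-1) but not 2**n."""
    n = 1
    while i % 2 == 0:
        i //= 2
        n += 1
    return n


def b(i):
    """The mapping b of the proof, with b_n = n as the distinct elements of a = N."""
    return block_index(i)


def agreement_blocks(c, window):
    """Blocks I_n that meet the set {i in window : b(i) = c}."""
    return {block_index(i) for i in window if b(i) == c}


window = range(1, 200)
blocks_for_3 = agreement_blocks(3, window)   # where b agrees with the constant 3
blocks_for_0 = agreement_blocks(0, window)   # 0 is not among the b_n
```

For each constant c the agreement set lies inside a single block $I_n$ or is empty. Since no block belongs to $\mathcal{U}$, Definition 3.1 gives $b \ne {}^*c$ for every standard c.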



The internal entities, defined to be the elements of the special standard sets 
*R,, can also be characterized as follows. An entity a is internal if and only if a 
is an element of a standard entity. In order to see this we need only to show that 
if $a \in {}^*b$, $b \in R$, then a is internal. Now from $b \in R$ it follows that $b \in R_n$ for some n
which implies that $b \subset R_0 \cup R_{n-1}$, and so, by Lemma 3.2(v), $a \in {}^*b \subset {}^*R_0 \cup {}^*R_{n-1}$
implies $a \in {}^*R_0 \cup {}^*R_{n-1}$ which shows that a is internal. In view of Theorem 3.5,
we may ask the question, what about the nature of the entities which are elements 
of internal entities? The answer is that they are internal, as the following theorem 
shows. The converse, however, is not true. In fact, we shall see later in Section 5 
that a set of internal entities need not be internal. 


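THEOREM 3.6. If $a \in b \in {}^*R_n$ $(n \ge 1)$, then $a \in {}^*R_0 \cup {}^*R_{n-1}$, that is, the elements of
an internal entity are internal.


Proof. From $b \in {}^*R_n$ it follows that $U = \{i : b(i) \subset R_0 \cup R_{n-1}\} = \{i : b(i) \in R_n\} \in \mathcal{U}$,
and so for all $i \in U$ with $a(i) \in b(i)$, which again form a set in $\mathcal{U}$, we have
$a(i) \in R_0 \cup R_{n-1}$. Hence, by Lemma 3.2(v) and Definition 3.1,
$a \in {}^*(R_0 \cup R_{n-1}) = {}^*R_0 \cup {}^*R_{n-1}$, and the proof is finished.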


As in the case of the L-structure R we shall call an *L-wff admissible whenever
all the quantifiers occurring in it are of the form “$(\forall x)[[x \in a] \Rightarrow \cdots]$” and
“$(\exists x)[[x \in a] \wedge \cdots]$”, where a is a constant denoting an entity of $R^I$.


An admissible wff of *L is called internal whenever all the constants occurring
in it denote internal entities. An admissible wff of *L is called standard whenever
all the constants occurring in it denote standard entities. Thus a standard wff
is internal.

The set of all internal sentences of *L will be denoted by ${}^*K = {}^*K({}^*L)$, and
the subset of all internal sentences which hold in *(R) will be denoted by ${}^*K_0 =
{}^*K_0({}^*L)$.

If V is an admissible wff of L, then its *-transform *V is defined to be that
standard wff of *L which is obtained from V by replacing in V all the constants,
say, $a_1,\ldots,a_n$, occurring in it, by ${}^*a_1,\ldots,{}^*a_n$ but leaving the variables and bracket-
ing unchanged.

We shall now prove that the *-imbedding has the following important property. 


THEOREM 3.7. Let $V = V(x_1,\ldots,x_p)$ be an admissible L-wff with the free
variables $x_1,\ldots,x_p$, and let $A = \{(x_1,\ldots,x_p) : (x_1,\ldots,x_p) \in a$ and $V(x_1,\ldots,x_p)\}$,
where a is an arbitrary entity of R. Then $A \in R$ and


${}^*A = \{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in {}^*a$ and ${}^*V(y_1,\ldots,y_p)\}$.


Proof. That $A \in R$ is trivial. If $V = V(x_1,\ldots,x_p,a_1,\ldots,a_q)$ is atomic, that is,
V has the form $(x_1,\ldots,x_p,a_1,\ldots,a_{q-1}) \in a_q$ or $(x_1,\ldots,x_{p-1},a_1,\ldots,a_q) \in x_p$, with
possible permutation of the variables, then the result follows immediately from
Definition 3.1. In order to show that the result holds for all wff V of L without quan-
tifiers we have to show that if it holds for two such wff V and W, then it also holds
for $[V \wedge W]$ and $[\neg V]$. As is well known this will take care of all the logical con-
nectives. Assume that ${}^*A = \{(x_1,\ldots,x_p) : (x_1,\ldots,x_p) \in {}^*a$ and ${}^*V(x_1,\ldots,x_p)\}$, then
we have to show that


${}^*B = \{(x_1,\ldots,x_p) : (x_1,\ldots,x_p) \in {}^*a$ and $\neg\,{}^*V(x_1,\ldots,x_p)\}$,


where $B = a - A$. Since, by Lemma 3.2(vi), ${}^*B = {}^*a - {}^*A$ the result follows.
Assume now that $V = V(x_1,\ldots,x_p,y_1,\ldots,y_q)$ and $W = W(x_1,\ldots,x_p,z_1,\ldots,z_r)$ are
two L-wff without quantifiers for which the result holds, and let


$A = \{(x_1,\ldots,x_p,y_1,\ldots,y_q,z_1,\ldots,z_r) : (x_1,\ldots,x_p,y_1,\ldots,y_q,z_1,\ldots,z_r) \in a$ and $[V \wedge W]\}$.


Then $A = \{(x_1,\ldots,z_r) : (x_1,\ldots,z_r) \in a$ and $V\} \cap \{(x_1,\ldots,z_r) : (x_1,\ldots,z_r) \in a$ and $W\}$
implies, by Lemma 3.2(v), that ${}^*A = {}^*\{\cdots\} \cap {}^*\{\cdots\} = \{(x_1,\ldots,z_r) : (x_1,\ldots,z_r) \in {}^*a$
and $[{}^*V \wedge {}^*W]\}$, and so the result holds for all wff without quantifiers.

For admissible wff with quantifiers we shall use induction on the number n of
quantifiers. For $n = 0$ the result was shown above. Assume now that the result
holds for all admissible wff with less than or equal n quantifiers. Let V be an ad-
missible wff with $(n + 1)$ quantifiers which is written in its prenex normal form
$(qx_{n+1})\cdots(qx_1)W(x_1,\ldots,x_{n+1},y_1,\ldots,y_p)$, where W has no quantifiers and $y_1,\ldots,y_p$
are the free variables occurring in V. Without loss of generality we may assume that
$(qx_{n+1})$ is the existential quantifier $(\exists x_{n+1})$, otherwise we consider not V. Let b
denote the domain of $(\exists x_{n+1})$. Then since V is admissible, $b \in R$. Let


$B = \{((y_1,\ldots,y_p),x_{n+1}) : ((y_1,\ldots,y_p),x_{n+1}) \in a \times b$ and $(qx_n)\cdots(qx_1)W\}$,

where $a \in R$. Then, by the induction hypothesis and Lemma 3.2(v), we obtain that

${}^*B = \{((y_1,\ldots,y_p),x_{n+1}) : ((y_1,\ldots,y_p),x_{n+1}) \in {}^*a \times {}^*b$ and $(qx_n)\cdots(qx_1){}^*W\}$.

The domain of the binary relation B is the set

$A = \{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in a$ and $(\exists x_{n+1})[x_{n+1} \in b \wedge (qx_n)\cdots(qx_1)W]\}$
$= \{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in a$ and $V(y_1,\ldots,y_p)\}$.

The domain of the binary relation ${}^*B$ is, however, the set

$\{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in {}^*a$ and $(\exists x_{n+1})[x_{n+1} \in {}^*b \wedge (qx_n)\cdots(qx_1){}^*W]\}$
$= \{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in {}^*a$ and ${}^*V\}$.


Then, by Lemma 3.2(vii), we obtain the desired result that

${}^*A = \{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in {}^*a$ and ${}^*V\}$,

and the proof is finished.


We are now in a position to prove the Fundamental Theorem about ultrapowers 
which we shall refer to throughout the rest of the paper by F.T. 


THEOREM 3.8. *(R) is a higher order nonstandard model of R, that is, an ad- 
missible sentence V of K(L) holds in R if and only if *V holds in *(R), and R is 
properly imbedded in *(R). 


Proof. Theorem 3.5 tells us that the imbedding $a \to {}^*a$ of R into *(R) is proper.
We have to show that if $V \in K(L)$, then $V \in K_0$ if and only if ${}^*V \in {}^*K_0$. If V has no
quantifiers, then it follows immediately from Definition 3.1. Assume that $V \in K$
has the prenex normal form $V = (qx_n)\cdots(qx_1)W$, where W has no quantifiers.
There is no loss in generality to assume that $(qx_n)$ is the existential quantifier $(\exists x_n)$.
Then $V \in K_0(L)$ is equivalent to “the set $A = \{x_n : x_n \in a$ and $(qx_{n-1})\cdots(qx_1)W\} \ne \emptyset$”,
where a is the domain of $(\exists x_n)$. Then, by Theorem 3.7 and Lemma 3.2(i), we see
that $A \ne \emptyset$ is equivalent to ${}^*A = \{x_n : x_n \in {}^*a$ and $(qx_{n-1})\cdots(qx_1){}^*W\} \ne {}^*\emptyset$
which itself is equivalent to ${}^*V \in {}^*K_0$, and the proof is finished.

An important aspect of the method of nonstandard analysis is to use the F.T. 
repeatedly to transform the true statements of R into true statements about the 
internal entities of *(R). To illustrate this we shall give a number of examples dealing 
with the set theory of R. 


EXAMPLES 3.9. (i) The individuals of R are the “urelements” of the set theory
of R in the sense that although they are different from the empty set $\emptyset$ there are
no entities of R which are elements of individuals. This true statement can be ex-
pressed by the following infinite list of sentences of $K_0$.

$(\forall x)(\forall y)[x \in R] \wedge [y \in R_n] \Rightarrow [\neg\, y \in x]$, $n = 0, 1, 2, \ldots$.


From the F.T. we conclude that ${}^*K_0$ contains the following list of sentences


$(\forall x)(\forall y)[x \in {}^*R] \wedge [y \in {}^*R_n] \Rightarrow [\neg\, y \in x]$, $n = 0, 1, 2, \ldots$.

In words, there are no internal entities which are elements of the individuals
of *(R).
(ii) One of the axioms of set theory states that the union of the elements of a
set is a set. For the set theory of R this means that $K_0$ contains the following infinite
list of sentences.


$(\forall z)[z \in R_n] \Rightarrow (\exists y)[y \in R_n] \wedge (\forall x)[x \in R_n] \Rightarrow [[x \in y]$
$\Leftrightarrow (\exists u)[u \in R_n] \wedge [u \in z] \wedge [x \in u]]$.


Thus from the F.T. we have the following result: The union of the elements
of an internal entity is an internal entity.



(iii) The power set axiom of set theory states that for every set there exists a
set whose elements are the subsets of this set. Thus $K_0$ contains the following infinite
list of sentences.


$(\forall x)[x \in R_n] \Rightarrow (\exists y)[y \in R_{n+1}] \wedge (\forall z)[z \in R_n] \Rightarrow [[z \in y]$
$\Leftrightarrow [z \subset x]]$, $n = 1, 2, \ldots$.


Then the F.T. implies that the set of all internal entities which are subsets of
an internal entity is an internal entity.

(iv) Lemma 2.1(v) states that the domain and range of every entity of R which
is a binary relation is an entity of R. This again can be expressed by an infinite list
of sentences of $K_0$.


$(\forall b)[b \in B_n] \Rightarrow (\exists z)[z \in R_n] \wedge (\forall x)[x \in R_n] \Rightarrow [[x \in z]$
$\Leftrightarrow (\exists y)[y \in R_n] \wedge [(x,y) \in b]]$


$(n = 3, 4, \ldots)$, where $B_n$ denotes the entity of all binary relations of rank $\le n$.
The F.T. then implies that the domain and range of any internal binary relation
is internal.

Another remark which is of importance is that if be R is a binary relation, then 
any property which b possesses and which can be expressed by sentences of Ky 
also holds for *b. For instance, if b is an order relation or function or equivalence 
relation, then *b is an order relation or function or equivalence relation. If, however, 
be R wellorders its domain, then *b wellorders its domain in the sense that every 
nonempty internal subset of the domain of *b has a first element. 

(v) From the axioms of set theory it follows that the image of a set under a binary
relation is a set. Thus in R the following statement holds. If $b \in R$ is a binary relation
and $a \in R$, then $\{y : (\exists x)(x \in a \wedge (x,y) \in b)\} \in R$. We leave it now to the reader to
show that this statement can be expressed by sentences of $K_0$. The F.T. tells us
that the following result holds.

The image of an internal entity under an internal binary relation is internal. 

In Theorem 3.6 we have shown that the entities of $R^I$ which are elements of
an internal entity are internal, and we remarked that a set of internal entities need 
not be internal (see Section 5). One of the problems in nonstandard analysis is to 
decide whether certain sets of internal entities are internal or not. As we shall see 
in the subsequent sections, one of the methods used to decide such a question in- 
volves F.T., by showing that the set in question violates a certain property which 
it should possess, according to the F.T., if it had been internal. Another useful 
and helpful result in this respect is the following theorem. 


THEOREM 3.10. Let $V = V(x_1,\ldots,x_q)$ be an internal wff with the free variables
$x_1,\ldots,x_q$, and let $a \in {}^*(R)$ be an internal entity. Then the set $\{(x_1,\ldots,x_q) : (x_1,\ldots,x_q)$
$\in a$ and $V(x_1,\ldots,x_q)\}$ is internal.



Proof. Suppose first that V has no quantifiers, that is, $V = V(x_1,\ldots,x_q,a_1,\ldots,a_p)$, where
$a_1,\ldots,a_p$ are the constants occurring in V which, by hypothesis, denote internal entities.
Since a is internal, it follows immediately that the mapping $i \to E(i) = \{(x_1,\ldots,x_q) :
(x_1,\ldots,x_q) \in a(i)$ and $V(x_1,\ldots,x_q,a_1(i),\ldots,a_p(i))\}$ is a mapping of I into $R_n$ for
some n, and so determines an internal entity which we shall denote by E. Then
it is easy to see that $E = \{(x_1,\ldots,x_q) : (x_1,\ldots,x_q) \in a$ and $V\}$. This proves the result
for internal wff without quantifiers. For general internal wff we shall use again
induction on the number of quantifiers. Thus assume that the theorem holds for
all internal wff with $\le n$ quantifiers. Let $V = (qx_{n+1})\cdots(qx_1)W$ be an internal
wff with the free variables $y_1,\ldots,y_p$. There is no loss in generality to assume that
$(qx_{n+1}) = (\exists x_{n+1})$ with domain $b \in {}^*(R)$. Since b is internal it follows from the
induction hypothesis that the binary relation


$B = \{((y_1,\ldots,y_p),x_{n+1}) : ((y_1,\ldots,y_p),x_{n+1}) \in a \times b$ and
$(qx_n)\cdots(qx_1)W(y_1,\ldots,y_p,x_{n+1})\}$


is internal, and so, by Example 3.9(iv), its domain


$\{(y_1,\ldots,y_p) : (y_1,\ldots,y_p) \in a$ and $(\exists x_{n+1})(qx_n)\cdots(qx_1)W\}$

is internal, and the proof is finished.


4. The nonstandard real number system *R. The set *R of individuals of the
$\mathcal{U}$-ultrapower *(R) of the superstructure R, where $\mathcal{U}$ is a δ-incomplete ultrafilter,
has according to the F.T. the same properties as R as far as they can be expressed
by sentences of $K_0$.

Since R is a totally ordered field and since it is easy to see that this can be ex-
pressed by sentences of $K_0$ it follows that *R is a totally ordered field. The imbedding
$a \to {}^*a$ of R into *R imbeds R into a subfield of *R. In order to simplify our nota-
tion we shall denote the extensions of the algebraic operations and order when passing
from R to *R by the same symbols. Thus $a + b = c$ in *R means in terms of $\mathcal{U}$
that $\{i : a(i) + b(i) = c(i)\} \in \mathcal{U}$, and similarly for subtraction and multiplication.
Furthermore, $a \le b$ in *R means $\{i : a(i) \le b(i)\} \in \mathcal{U}$. As an illustration the state-
ment that the order relation “<” totally orders R can be expressed by the following
sentence of $K_0$


$(\forall x)(\forall y)[x \in R \wedge y \in R] \Rightarrow [x < y] \vee [x = y] \vee [x > y]$,


and so, as already mentioned above it follows from the F.T. that the extension
of the order relation to *R totally orders *R.

The unit element $e \in {}^*R$ has the property that for all $0 \ne r \in R$, ${}^*r({}^*r)^{-1} = e$,
and so $e = {}^*1$, where 1 denotes the real number one.

The reader will appreciate that we shall simplify our notation further by no 
longer using the *-notation to denote the standard individuals of *R. Thus we
shall from now on identify R with the subfield of the standard numbers of *R, and
we shall feel free to write $R \subset {}^*R$.

The absolute value $|r|$ of a real number $r \in R$ defined by $|r| = r$ whenever
$r \ge 0$ and $|r| = -r$ whenever $r < 0$ can be considered to be a mapping of R into
$R^+ = \{r : r \in R$ and $r \ge 0\}$, the set of all nonnegative real numbers. The constant
of L denoting this mapping extends by passing from R to *(R) to a mapping ${}^*|\ |$
of *R into ${}^*(R^+)$ which according to the F.T. has the property that ${}^*|a| = a$
for all ${}^*R \ni a \ge 0$ and ${}^*|a| = -a$ for all ${}^*R \ni a < 0$. Also in this case we shall
drop the *-notation and write $|a|$ to denote the absolute value of a real number
$a \in {}^*R$. Similarly, we shall write max(a, b) and min(a, b), $a, b \in {}^*R$, for the extensions
*max(,) and *min(,) of the mappings max(r, s) and min(r, s) of $R \times R$ into R
respectively.

This liberalization of the notation and some additional notation later on will 
help a great deal to simplify the mechanics of the subject and can hardly be expected
to cause confusion. 

Let the constant S denote a subset of R. Then on passing to *(R), *S denotes
a subset of *R which is a standard entity and which by the F.T. has the same prop-
erties as S as far as they can be expressed by sentences of $K_0$. More precisely
the substructure *(S) of *(R), where S denotes the superstructure defined by S,
is an ultrapower nonstandard model of S. On the basis of Lemma 3.2(iii) and the
present notation, we feel free to write $S \subset {}^*S$. Furthermore, by Lemma 3.2(v),
$S = {}^*S$ if and only if S is a finite set.

If the constant N denotes the set of natural numbers of R, that is, $N = \{1, 2, \ldots\}$,
then the standard entity *N denotes a set of numbers of *R which again has the
same properties as N as far as they can be expressed by sentences of $K_0$. More
precisely, *(N) is an ultrapower higher order nonstandard model of arithmetic.

From Theorem 3.5 it follows that *R is a proper extension of R, and so, accord- 
ing to a result from algebra to the effect that every Archimedean field is isomorphic 
to a subfield of R, we conclude that *R is non-Archimedean. But *R has the same 
properties as R and R is Archimedean. Let us now examine this apparent paradox. 
The fact that R is Archimedean can be expressed by the following sentence of $K_0$:

$$(\forall x)\,[x \in R] \Rightarrow \bigl[\,[(\forall n)\,[n \in N] \Rightarrow [nx \le 1]] \Rightarrow [x \le 0]\,\bigr],$$

and so, by the F.T., the following statement holds for ${}^*R$:

$$(\forall x)\,[x \in {}^*R] \Rightarrow \bigl[\,[(\forall n)\,[n \in {}^*N] \Rightarrow [nx \le 1]] \Rightarrow [x \le 0]\,\bigr],$$

that is, with the proper interpretation of the constants, ${}^*R$ is Archimedean with
respect to ${}^*N$. It is not Archimedean in the sense of the metalanguage, that is, in the
sense that if $0 < a \in {}^*R$, then there exists a natural number n in the metalanguage such that
$a + \cdots + a > 1$ (n times).

Up till now we have only considered some properties of R and their extensions
which can be formulated in a lower order language, that is, sentences in which
quantification is over numbers only. Let us now examine a few of the higher order
type properties of R. One of the important higher order properties which R possesses
and which we have already referred to in the beginning of Section 3 is the so-called
Dedekind completeness property of R which states that every nonempty subset
of R which is bounded above has a least upper bound. This statement about R can
easily be expressed by a sentence of $K_0$, which will contain a universal quantifier
ranging over subsets of R. Then it follows from the F.T. that ${}^*R$ satisfies a Dedekind
completeness property of the following kind.


(4.1) Every nonempty internal subset of *R which is bounded above has a least 
upper bound. 


Since ${}^*(\hat{N})$ is a higher order nonstandard model of arithmetic, it follows that
under the appropriate interpretation of the F.T. the model ${}^*(\hat{N})$ satisfies all the
axioms of Peano. For instance, the principle of induction stating that every nonempty
set of natural numbers has a first element, being a higher order property of N, has
to be interpreted in ${}^*(\hat{N})$ in the following sense.


(4.2) Every nonempty internal subset of *N has a first element. 


From Theorem 3.5 it also follows that ${}^*N - N \ne \emptyset$. More precisely, we shall
now show that there exists a natural number $\omega \in {}^*N$ such that $|r| < \omega$ for all $r \in R$.
Indeed, if $\omega(i) = n$ for all $i \in J_n$ ($n = 1, 2, \ldots$), where $\{J_n\}$ denotes the partition
of J such that $J_n \notin \mathcal{U}$ for all $n = 1, 2, \ldots$, then $\omega$ is a mapping of J into N with the
property that for all $0 < r \in R$ the set $\{i : \omega(i) \le r\} \notin \mathcal{U}$, and so $\omega \in {}^*N$ and $|r| < \omega$
for all $r \in R$. This proves, on the basis that $\mathcal{U}$ is $\delta$-incomplete, that ${}^*N$ contains a
number which is larger than any positive real number, that is, a number which could
be called infinitely large. The reader will find it easy now to appreciate the following
definition and facts about ${}^*R$.


DEFINITION 4.3. A real number $a \in {}^*R$ is called finite whenever there exists a
standard real number $0 < r \in R$ such that $|a| < r$. A real number $a \in {}^*R$ which
is not finite will be called infinite.

A real number $a \in {}^*R$ is called an infinitesimal or infinitely small whenever
$|a| < r$ for all $0 < r \in R$.


The set of all finite real numbers of ${}^*R$ will be denoted by $M_0$, and the set of all
infinitesimals by $M_1$.

Observe that $R \subset M_0$, $M_1 \subset M_0$ and $R \cap M_1 = \{0\}$, that is, 0 ("null"), being
regarded also as an infinitesimal, is the only standard infinitesimal.

A real number $a \in {}^*R$ is infinite if and only if $|a| > r$ for all $0 < r \in R$. Thus
the natural number $\omega$ defined above is infinite. Its reciprocal, however, is an infinitesimal.
More generally, a real number $0 \ne a \in {}^*R$ is an infinitesimal if and only
if its reciprocal $1/a$ is infinite.
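
As a small worked illustration of the last statement, here is one direction written out for a positive infinite number a. It is only a sketch, and it assumes nothing beyond the definitions of finite and infinitesimal numbers given above and the ordered-field rules which the F.T. carries over to *R.

```latex
% Assumption: a > 0 is infinite, i.e. a > r' for every standard r' > 0.
% Apply this with r' = 1/r, where r > 0 is an arbitrary standard real number:
\[
  0 < r \in R
  \;\Longrightarrow\; a > \frac{1}{r}
  \;\Longrightarrow\; 0 < \frac{1}{a} < r .
\]
% Since r was arbitrary, |1/a| < r for all 0 < r in R,
% which is exactly the condition for 1/a to be an infinitesimal.
```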

The finite natural numbers are determined in the following theorem. 




THEOREM 4.4. A natural number $n \in {}^*N$ is finite if and only if n is a standard
natural number. In symbols, ${}^*N \cap M_0 = N$.


Proof. It is obvious that $N \subset M_0$. If $n \in {}^*N$ is finite, then there exists a standard
real number $0 < r \in R$ such that $n < r$. However, $K_0$ contains the sentence

$$(\forall x)\,[x \in N] \Rightarrow \bigl[\,[x \le r] \Rightarrow [x = 1] \vee [x = 2] \vee \cdots \vee [x = p]\,\bigr],$$

where r and p are constants and $p = [r]$ is the integral part of r. Thus by the F.T.
we obtain that $n = 1$ or $n = 2$ or $\ldots$ or $n = [r]$, and the proof is complete.

From Theorem 4.4 it follows that the set of all infinitely large natural numbers
is given by ${}^*N - N$. It is not uncustomary to denote infinitely large natural numbers
by lower case Greek letters, such as $\omega$, with or without subscripts.

The mapping $r \to [r]$ of $R^+$ into the set $N \cup \{0\}$, where $[r]$ denotes the largest
nonnegative integer less than or equal to r, extends on passing from $\hat{R}$ to ${}^*(\hat{R})$ to
a mapping ${}^*[\cdot]$ of ${}^*(R^+)$ into ${}^*N \cup \{0\}$. From the F.T. it follows that for all
$0 \le a \in {}^*R$, ${}^*[a]$ is the largest nonnegative integer $\le a$. Also in this case we shall
drop the *-notation and simply write $[a]$ for the integral part of a.

We shall now turn to a discussion of the properties of the finite numbers of *R. 

It is easy to see that $M_0$ is a subring of ${}^*R$, and in fact is an integral domain,
that is, $M_0$ has no divisors of zero. The set of infinitesimals constitutes a subring
of $M_0$ with the property that if $h \in M_1$ and $a \in M_0$, then $ah \in M_1$, that is, $M_1$ is
an ideal in $M_0$. In fact, it is easy to see that $M_1$ is a maximal ideal. Indeed, observe
that if $a \in M_0$ and $a \notin M_1$, then there exist positive real numbers $r_1, r_2 \in R$ such
that $0 < r_1 < |a| < r_2$, and so $1/a \in M_0$ shows that any ideal which properly contains
$M_1$ must contain the unit element 1 of $M_0$ and so is all of $M_0$.

If $a, b \in {}^*R$ and $a - b$ is infinitesimal, then we shall say that b is infinitely close
to a and we write $a =_1 b$.

Consider the quotient ring $M_0/M_1$. Then since $M_1$ is a maximal ideal in $M_0$,
the quotient ring $M_0/M_1$ is a field. We claim it is isomorphic to the field of standard
real numbers. The precise result and details are the subject of the following important
theorem.


THEOREM 4.5. The quotient ring $M_0/M_1$ is order isomorphic to the field R
of the standard real numbers.


Proof. First observe that if A is an equivalence class in $M_0$ modulo $M_1$, then A
cannot contain two different standard real numbers $r_1$ and $r_2$. Indeed, in that case
$|r_1 - r_2| =_1 0$, and so $r_1 \ne r_2$ implies by Definition 4.3 that $|r_1 - r_2| < |r_1 - r_2|$
and a contradiction is obtained. This shows that R is a subfield of $M_0/M_1$. To
complete the proof we have to show that to every $a \in M_0$ there corresponds a standard
real number r, which is then unique, such that $a - r =_1 0$. To this end,
observe that if $a \in M_0$, then the sets $D = \{r : r \in R \text{ and } r \le a\}$ and $D' = R - D$
define a Dedekind cut $(D, D')$ in R. Let $r \in R$ be the real number in R which determines
the same cut $(D, D')$. Then we shall show that $a =_1 r$. If not, then by Definition
4.3 there exists a positive real number $0 < \varepsilon \in R$ such that $|a - r| \ge \varepsilon$. If $a > r$,
then $|a - r| \ge \varepsilon$ implies that $r + \varepsilon/2 < a$, and contradicts the fact that a and r determine
the same cut. Similarly, if $r > a$, then $r - \varepsilon/2 > a$ gives rise to the same
contradiction. Thus $M_0/M_1$ is order isomorphic to R and the proof is finished.

The unique ring and order homomorphism of $M_0$ onto R with kernel $M_1$ plays a
very important role in the theory of infinitely small and infinitely large numbers.
We shall firmly establish it in the following definition.


DEFINITION 4.6. The ring and order homomorphism of $M_0$ onto R with kernel
$M_1$ will be called the standard part homomorphism and will be denoted by st.


In the next theorem, we shall summarize the basic properties of the homomor- 
phism st for later reference. 


THEOREM 4.7. (i) $\mathrm{st}(a + b) = \mathrm{st}(a) + \mathrm{st}(b)$, $\mathrm{st}(ab) = \mathrm{st}(a)\,\mathrm{st}(b)$ and $\mathrm{st}(a - b)
= \mathrm{st}(a) - \mathrm{st}(b)$ for all $a, b \in M_0$.

(ii) If $a, b \in M_0$, then $a \le b$ implies $\mathrm{st}(a) \le \mathrm{st}(b)$.

(iii) $\mathrm{st}(|a|) = |\mathrm{st}(a)|$, $\mathrm{st}(\max(a, b)) = \max(\mathrm{st}(a), \mathrm{st}(b))$ and $\mathrm{st}(\min(a, b)) =
\min(\mathrm{st}(a), \mathrm{st}(b))$ for all $a, b \in M_0$.

(iv) $\mathrm{st}(a) = 0$ if and only if $a \in M_1$.

(v) For all standard $r \in R$ we have $\mathrm{st}(r) = r$.

(vi) If $a \in M_0$ and $\mathrm{st}(a) \ge 0$, then $|a| =_1 \mathrm{st}(a)$.

(vii) For all $a, b \in M_0$ we have $a =_1 b$ if and only if $\mathrm{st}(a) = \mathrm{st}(b)$.


It is now customary to call the equivalence classes of $M_0$ with respect to $M_1$
the monads of the standard numbers determined by them. The monads are denoted
by $\mu(r)$, $r \in R$. Thus, in particular, $\mu(0) = M_1$.

We shall conclude this section with a number of remarks which are of interest 
in themselves. 


REMARKS. (i) (The standard part operation defined as a limit). The standard
part operation "st" can also be defined as follows. If $a \in M_0$, then, by Definition 4.3
and Definition 3.1, there is a set $U \in \mathcal{U}$ and a positive standard real number
$0 < r \in R$ such that $i \in U$ implies $|a(i)| < r$. Hence, the image of the ultrafilter $\mathcal{U}$
under the mapping $i \to a(i)$ of J into R is a basis of a bounded ultrafilter of subsets
of R, and so, by the local compactness of R, it converges to a unique real number r.
A simple observation shows that $r = \mathrm{st}(a)$. Thus, $\mathrm{st}(a) = \lim_{\mathcal{U}} a$ for all $a \in M_0$.

(ii) (A nonstandard construction of the real number system). The proof of
Theorem 4.5 suggests immediately the following alternative construction of the
real number system. Let the constant Q of L denote the field of rational numbers.
Then ${}^*(\hat{Q})$ is a higher order nonstandard model of the superstructure $\hat{Q}$. Thus the
set of individuals ${}^*Q \subset {}^*R$ is a subfield of ${}^*R$ which has the same properties as Q
as far as they can be expressed by sentences of $K_0$. From Theorem 3.5 we know that
${}^*Q \ne Q$, and in fact ${}^*Q$ contains an element which is larger than any standard real
number. It is an easy and interesting exercise for the reader to transform the properties
of Q to ${}^*Q$. We shall show here only that ${}^*Q$ can be used to define the real
number system. To this end, we single out the rationals of ${}^*Q$ which are finite, that
is, $q \in {}^*Q$ is finite whenever $|q|$ is smaller than some positive standard rational number. The
set of all finite rationals will be denoted by $Q_0$. Observe that $Q_0 = {}^*Q \cap M_0$. A
rational $q \in {}^*Q$ is called infinitesimal whenever $|q|$ is smaller than all positive standard
rationals. The set of all infinitely small rationals will be denoted by $Q_1$. Thus
$Q_1 = {}^*Q \cap M_1$. Then it is easy to see that $Q_1$ is a maximal ideal in the integral
domain $Q_0$. Thus the quotient ring $Q_0/Q_1$ is ring and order isomorphic to a field.
The proof of Theorem 4.5 shows us, however, that this field is isomorphic to the
field of Dedekind cuts of Q, and so, by definition, $Q_0/Q_1$ is isomorphic to the real
number system.

(iii) (The nonstandard complex number system). Within the framework of
axiomatic set theory the complex number system C may be regarded as a subtheory
of the theory of the superstructure $\widehat{R \times R}$ determined by $R \times R$. The algebraic
operations of addition and multiplication are denoted by constants which correspond
to certain six-place relations; and so ${}^*(\widehat{R \times R})$ may be looked upon as a higher order
nonstandard model of the complex number system.

It is advisable also in this case to employ the familiar notation $z = x + iy$ for
complex numbers, where now $x, y \in {}^*R$ and $i^2 = -1$. The set ${}^*C = {}^*R \times {}^*R$ of
the extended complex number system has of course the same properties as C, and
so is, in particular, a field. If $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, then also in ${}^*C$
we have

$$z_1 + z_2 = x_1 + x_2 + i(y_1 + y_2) \quad\text{and}\quad z_1 z_2 = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + x_2 y_1).$$

Furthermore, if $z = x + iy$, then x is called the real part of z and y is called the imaginary
part of z. A complex number $z = x + iy$ is finite whenever x and y are
finite, otherwise it is infinite. If x and y are both infinitely small, then $z = x + iy$
is called an infinitely small complex number. From Theorem 4.5 it follows that every
finite complex number is infinitely close to a unique standard complex number.
For further details concerning nonstandard complex function theory we refer the
reader to [8] and [10].


5. Definitions and properties of some external entities. We pointed out that the
converse of Theorem 3.6 need not hold, that is, a set of internal entities need not
be internal. In the preceding section we introduced a number of sets of individuals,
namely, the set of all infinitely large natural numbers ${}^*N - N$, the set of finite
numbers $M_0$, the set of infinitesimals $M_1$, and the monads $\mu(r)$, $r \in R$. It is now
natural to ask whether these sets are internal or not. To decide this
we shall use the following procedure. We assume the set in question is internal and
then show that it violates a property which it should have possessed on the basis
of the assumption that it is internal and the F.T. The details are contained in the
following theorem.


THEOREM 5.1. The nonempty sets ${}^*N - N$, $M_0$, $M_1$, $\mu(r)$ ($r \in R$), and the set
of infinitely large real numbers ${}^*R_\infty = {}^*R - M_0$ are all external.


Proof. Assume that ${}^*N - N$ is internal. Then since ${}^*N - N \ne \emptyset$ (Theorem 3.5)
we have by (4.2) that ${}^*N - N$ has a first element, say, $\omega_0$. But the set of infinitely
large natural numbers does not have a first element. Indeed, if $\omega \in {}^*N - N$, then
$k + 1 < \omega$ for all $k \in N$ implies that $\omega - 1 \in {}^*N - N$, and so $\omega_0 - 1 < \omega_0$ shows that
${}^*N - N$ has no first element. Thus ${}^*N - N$ is external.

Assume that the set $M_1$ is internal. Since $M_1 \ne \emptyset$ and $h \in M_1$ implies $|h| < 1$
it follows from (4.1) that $M_1$ has a least upper bound, say, $a_0$. From $0 \in M_1$ it follows
that $a_0 \ge 0$. Furthermore, $a_0 \notin M_1$: since $M_1$ contains elements other than 0 we have
$a_0 > 0$, and $a_0 \in M_1$ would give $2a_0 \in M_1$ with $2a_0 > a_0$.
But then $a_0/2$ is also an upper bound of $M_1$, smaller than $a_0$, and a contradiction is obtained,
and so $M_1$ is external.

Similarly on the basis of (4.1) we can show that $M_0$ is external. We leave it to
the reader as an exercise.

If ${}^*R_\infty = {}^*R - M_0$ is internal, then also $M_0 = {}^*R - {}^*R_\infty$ is internal, and a
contradiction is obtained. Thus ${}^*R_\infty$ is external.

Since the translation mappings of ${}^*R$ are internal (check this) it follows immediately
from the fact that $\mu(0)$ is external that $\mu(r) = \mu(0) + r$ ($r \in R$) is external. This completes
the proof.


REMARKS. (i) If $D \subset M_1$ is internal and nonempty, then according to (4.1) it
has a least upper bound. The above proof shows that this least upper bound is an
infinitesimal. Similarly, the least upper bound of a nonempty internal set of finite
numbers is finite. The greatest lower bound of a nonempty internal set of infinite
numbers is of course infinite.

(ii) The standard part operation is a mapping of $M_0$ onto R. It is, however,
not an internal mapping. Indeed, if it were internal, then according to Example
3.9(v) its domain $M_0$ would have to be internal which contradicts the preceding
theorem, and we conclude that the standard part operation is an external operation.

Let $A \in \hat{R}$ be infinite. Then according to Theorem 3.5 the set of all the nonstandard
entities of ${}^*A$ is not empty. More precisely, we have the following result.


THEOREM 5.2. If $A \in \hat{R}$, then the set ${}^*A - \{{}^*a : a \in A\}$ of all the nonstandard
elements of ${}^*A$ is either empty or external, and in the latter case the set $\{{}^*a : a \in A\}$
is also external.


Proof. If $A \in \hat{R}$, then ${}^*A - \{{}^*a : a \in A\} = \emptyset$ if and only if A is finite (Theorem
3.5 and Lemma 3.2(v)). Assume therefore that A is infinite. Then there is a one-to-one
mapping f of a subset of A onto the set $N = \{1, 2, \ldots\}$. If $B = {}^*A - \{{}^*a : a \in A\}$ is
internal, then $B \cap \mathrm{dom}({}^*f)$ is internal also (Theorem 3.10). Hence, by Example
3.9(v), we have that ${}^*N - N = {}^*f(B \cap \mathrm{dom}\,{}^*f)$ is internal which contradicts Theorem
5.1 and the proof is finished.

Although the preceding theorem shows that the set of nonstandard elements
of the extension of an infinite set of $\hat{R}$ is external, there are plenty of internal sets
whose elements are all internal entities which are not standard. Indeed, if $\omega \in {}^*N - N$,
then the set $\{\omega\}$ is internal but its element is not a standard entity. More generally
any finite set of internal entities which are not standard is internal. This statement
can be generalized as follows. We begin with a definition.


DEFINITION 5.3. A set D of internal entities of ${}^*(\hat{R})$ is called *-finite whenever
there exists a natural number $\omega \in {}^*N - N$ and an internal one-to-one mapping
of D onto the internal set $\{1, 2, \ldots, \omega\}$. In that case, we shall say that the internal
cardinal of D is $\omega$ or shortly that D has $\omega$ elements.


If D is *-finite, then it is clear that its external cardinal is at least as big as $\aleph_0$.
Concerning *-finite sets we have the following result.


THEOREM 5.4. Every *-finite set of internal entities is internal. A *-finite set 
of real numbers has a largest and a smallest element. 


Proof. Since, by Example 3.9(iv), the domain of an internal function is internal 
it follows immediately from Definition 5.3 that a *-finite set is internal. 

If D is a *-finite set of real numbers, then from the sentence of $K_0$ stating that
every finite set of real numbers of R has a largest and a smallest element it follows
from the F.T. that every *-finite set of real numbers in ${}^*R$ has a largest and a smallest
element. This completes the proof.


REMARK. If the internal set D is *-finite, then it must contain at least one internal
entity which is not standard, and so at least externally infinitely many of those.
This can be shown as follows. If the entities of D are all standard, then there exists
a standard set $A \in \hat{R}$ such that $D = \{{}^*a : a \in A\}$ (use first part of Theorem 3.6). Since
the cardinal of A is infinite it follows from Theorem 5.2 that the set $D = \{{}^*a : a \in A\}$
is external, and so a contradiction is obtained.


6. The theory of limits. As a first example and also for later reference we shall 
illustrate what kind of effect the theory of infinitely small and infinitely large numbers 
has on the theory of limits. 

We recall that a (standard) sequence $\{s_n : n = 1, 2, \ldots\}$ can be regarded as a
mapping of N into R, and so being a subset of $N \times R$ it is an entity of $\hat{R}$ which we
shall denote for obvious reasons by s. On passing from $\hat{R}$ to ${}^*(\hat{R})$ the entity s extends
to an entity ${}^*s$ which according to the F.T. and Lemma 3.2(vii) is a mapping of
${}^*N$ into ${}^*R$. Furthermore, for all finite $n \in N$ we have ${}^*s_n = s_n$ as follows from the
fact that ${}^*(\mathrm{ran}\, s) = \mathrm{ran}\, {}^*s$ and the convention of dropping the *-notation for individuals.
The standard sequence ${}^*s$ in ${}^*(\hat{R})$ has the same properties as the sequence s
as far as they can be expressed by sentences of $K_0$. With this fundamental principle
in mind we shall now prove the following theorems.


THEOREM 6.1. A sequence $\{s_n : n = 1, 2, \ldots\}$ in R is bounded if and only if
${}^*s_\omega$ is finite for all infinitely large natural numbers $\omega \in {}^*N - N$.


Proof. This follows immediately from the remark following Theorem 5.1 to
the effect that the least upper bound of an internal set of finite numbers is finite.
Hence, if $(\mathrm{ran}\, {}^*s) \subset M_0$, then $|{}^*s_n| \le a$ for all $n \in {}^*N$ and some $a \in M_0$, that is,
$|s_n| \le \mathrm{st}(a)$ for all $n \in N$, and the proof is finished.

In the classical sense a sequence $\{s_n : n = 1, 2, \ldots\}$ is said to be convergent with
limit s if and only if

$$(\forall \varepsilon)\,[0 < \varepsilon \in R] \Rightarrow (\exists x)\,[x \in N] \wedge (\forall y)\,[y \in N \wedge x \le y] \Rightarrow [\,|s_y - s| < \varepsilon\,]. \tag{$*$}$$

In nonstandard analysis this is expressed in a more intuitive fashion as follows.


THEOREM 6.2. Let $\{s_n : n = 1, 2, \ldots\}$ be a sequence of numbers of R, and let
$s \in R$. Then $\lim_{n\to\infty} s_n = s$ if and only if ${}^*s_\omega =_1 s$ for all $\omega \in {}^*N - N$.


Proof. Assume first that $\lim_{n\to\infty} s_n = s$. Then from the sentence $(*)$ of $K_0$ the
following is a sentence of $K_0$:

$$(\forall x)\,[x \in N \wedge x > n] \Rightarrow |s_x - s| < \varepsilon,$$

where $\varepsilon > 0$ and $n \in N$ are constants. Thus the following *L-sentence holds:

$$(\forall x)\,[x \in {}^*N \wedge x > n] \Rightarrow |{}^*s_x - s| < \varepsilon.$$

In particular, for all $\omega \in {}^*N - N$ we have that $|{}^*s_\omega - s| < \varepsilon$. The latter statement
holds, however, for all $\varepsilon > 0$, that is, ${}^*s_\omega =_1 s$ for all $\omega \in {}^*N - N$.

In order to see that the condition is sufficient we observe that if $\varepsilon$ is a constant
denoting a positive number of R, the following sentence holds in ${}^*(\hat{R})$:

$$(\exists y)\,[y \in {}^*N] \wedge (\forall x)\,[x \in {}^*N \wedge y < x] \Rightarrow |{}^*s_x - s| < \varepsilon.$$

Indeed, we need to take for y only an infinitely large natural number. Observe
now that this sentence is the *-transform of the sentence

$$(\exists y)\,[y \in N] \wedge (\forall x)\,[x \in N \wedge y < x] \Rightarrow |s_x - s| < \varepsilon,$$

and so by the F.T. holds in R. This means that there is an index $n_0 \in N$ such that
$|s_n - s| < \varepsilon$ for all $n > n_0$. Since this holds for all $\varepsilon > 0$ we obtain that $\lim_{n\to\infty} s_n = s$
and the proof is finished.

The condition ${}^*s_\omega =_1 s$ for all $\omega \in {}^*N - N$ is equivalent to $\mathrm{st}({}^*s_\omega) = s$ for all
$\omega \in {}^*N - N$.
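
To see the criterion at work on a concrete case, consider the sequence $s_n = (2n+1)/(n+3)$. This sequence is an illustration chosen here and does not come from the text; the computation is a sketch that assumes only the F.T. (so that the same formula defines the extended sequence) and the fact that the reciprocal of an infinite number is infinitesimal.

```latex
% By the F.T. the extended sequence is given by the same formula:
\[
  {}^*s_\omega \;=\; \frac{2\omega + 1}{\omega + 3} \;=\; 2 - \frac{5}{\omega + 3},
  \qquad \omega \in {}^*N - N .
\]
% omega + 3 is infinite, so 5/(omega + 3) is infinitesimal. Hence
\[
  {}^*s_\omega =_1 2 \quad\text{for all } \omega \in {}^*N - N,
  \qquad\text{that is,}\qquad
  \lim_{n\to\infty} s_n = \operatorname{st}({}^*s_\omega) = 2 .
\]
```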




Theorem 6.2 also tells us immediately that if the limit exists it is unique. Furthermore,
Theorem 6.1 shows that every convergent sequence is bounded.


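EXAMPLES 6.3. (i) If one wishes to show that $\lim_{n\to\infty} \sqrt[n]{n} = 1$, then set
$s_n = \sqrt[n]{n} - 1$ ($n = 1, 2, \ldots$) and observe that

$$n = (1 + s_n)^n = \sum_{k=0}^{n} \binom{n}{k} s_n^k \ge \binom{n}{2} s_n^2 \quad\text{for all } n = 1, 2, \ldots.$$

Hence, $0 \le s_n \le \sqrt{2/(n-1)}$ for all $n > 1$, and so, also $0 \le {}^*s_m \le \sqrt{2/(m-1)}$ for
all $1 < m \in {}^*N$. In particular if $\omega \in {}^*N - N$ is infinitely large, then $0 \le {}^*s_\omega \le \sqrt{2/(\omega-1)}$
and $\sqrt{2/(\omega-1)} \in M_1$ implies that ${}^*s_\omega =_1 0$, and so, by Theorem 6.2, $\lim_{n\to\infty} s_n = 0$,
and the proof is finished.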
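
The estimate above can be checked numerically for ordinary (standard) indices. The following is a minimal sketch, not part of the original argument; the function names `s`, `bound` and `bound_holds` are introduced here purely for illustration. It shows only that the standard inequality holds for the tested n; the nonstandard step (passing to infinite ω) is of course not something a computer check can reproduce.

```python
import math


def s(n: int) -> float:
    """s_n = n**(1/n) - 1, the quantity shown above to tend to 0."""
    return n ** (1.0 / n) - 1.0


def bound(n: int) -> float:
    """Upper bound sqrt(2/(n-1)) from the binomial estimate (needs n > 1)."""
    return math.sqrt(2.0 / (n - 1))


def bound_holds(limit: int) -> bool:
    """Check 0 <= s_n <= sqrt(2/(n-1)) for every 2 <= n <= limit."""
    return all(0.0 <= s(n) <= bound(n) for n in range(2, limit + 1))


if __name__ == "__main__":
    for n in (2, 10, 1000, 10**6):
        print(n, s(n), bound(n))
```

Both columns printed by the script shrink as n grows, with the bound always dominating, which is the standard shadow of the statement that the extended sequence is infinitesimal at every infinite index.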

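(ii) (Algebra of limits). The usual rules for calculating with limits are now easily
obtained. For, if $\lim s_n = s$ and $\lim t_n = t$, then ${}^*(s + t)_\omega = {}^*s_\omega + {}^*t_\omega =_1 s + t$ for
all $\omega \in {}^*N - N$ and so $\lim_{n\to\infty}(s_n + t_n) = s + t$. Similarly, $\mathrm{st}({}^*(st)_\omega) = \mathrm{st}({}^*s_\omega\, {}^*t_\omega) =
\mathrm{st}({}^*s_\omega)\,\mathrm{st}({}^*t_\omega) = st$ for all $\omega \in {}^*N - N$ shows that $\lim_{n\to\infty} s_n t_n = st$. In the same way
one shows that if $t \ne 0$, then $\lim_{n\to\infty} s_n/t_n = s/t$.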

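(iii) It is well known that if $\lim_{n\to\infty} s_n = s$, then

$$\lim_{n\to\infty} \frac{s_1 + \cdots + s_n}{n} = s.$$

The proof of this result in nonstandard analysis reads as follows. From $\lim_{n\to\infty} s_n = s$
it follows first of all that for some $0 < r \in R$, $|{}^*s_n - s| < r$ for all $n \in {}^*N$ (Theorem
6.1) and ${}^*s_n - s =_1 0$ for all $n \in {}^*N - N$. Now let $\omega \in {}^*N - N$ and let $\omega_0 = [\sqrt{\omega}]$.
Then the following simple estimation gives the required result:

$$\left|\frac{{}^*s_1 + \cdots + {}^*s_\omega}{\omega} - s\right|
\le \frac{|{}^*s_1 - s| + \cdots + |{}^*s_{\omega_0} - s|}{\omega}
+ \frac{|{}^*s_{\omega_0+1} - s| + \cdots + |{}^*s_\omega - s|}{\omega}$$

$$\le \frac{\omega_0}{\omega}\, r + \frac{\omega - \omega_0}{\omega}\, \max(|{}^*s_n - s| : \omega_0 < n \le \omega) =_1 0,$$

by Theorem 5.4.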
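
The two-part estimate above has an exact finite counterpart, which can be sketched as follows. This is an illustration only: the finite index n plays the role of ω, `math.isqrt(n)` plays the role of ω_0 = [√ω], and the sample sequence `example_seq` (with limit 1 and deviations below r = 2) is a hypothetical choice made here, not one taken from the text.

```python
import math


def cesaro_estimate(seq, limit, n, r):
    """Split |mean(s_1..s_n) - limit| at n0 = floor(sqrt(n)), as in the text.

    seq   : function k -> s_k (an example sequence supplied by the caller)
    limit : the limit s of the sequence
    r     : a bound with |s_k - limit| < r for all k
    Returns (actual error of the mean, head bound, tail bound). Needs n >= 2.
    """
    n0 = math.isqrt(n)
    devs = [abs(seq(k) - limit) for k in range(1, n + 1)]
    mean_error = abs(sum(seq(k) for k in range(1, n + 1)) / n - limit)
    head_bound = (n0 / n) * r                      # (omega_0 / omega) * r
    tail_bound = ((n - n0) / n) * max(devs[n0:])   # max over n0 < k <= n
    return mean_error, head_bound, tail_bound


def example_seq(k):
    """Hypothetical sample sequence: converges to 1, |s_k - 1| <= 1 < 2."""
    return 1.0 + (-1) ** k / math.sqrt(k)


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**5):
        print(n, cesaro_estimate(example_seq, 1.0, n, 2.0))
```

Both bounds shrink as n grows: the head because n0/n is small, the tail because the deviations beyond n0 are small. For infinite ω both terms become infinitesimal, which is the content of the estimate.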
Cauchy’s criterion for convergence in analysis takes on the following form. 


THEOREM 6.4. A sequence $\{s_n : n = 1, 2, \ldots\}$ of real numbers of R is convergent
if and only if ${}^*s_\omega =_1 {}^*s_{\omega'}$ for all $\omega, \omega' \in {}^*N - N$.


Proof. From Cauchy's criterion $|s_n - s_m| < \varepsilon$ for all n, m sufficiently large it
follows as in the proof of Theorem 6.2 that the condition is necessary. In order to
prove that the condition is sufficient we have only to show in view of Theorem 6.2
that ${}^*s_\omega$ is finite for all $\omega \in {}^*N - N$. To this end, assume that there exists an infinitely
large natural number $\omega_0 \in {}^*N - N$ such that ${}^*s_{\omega_0}$ is infinite. We define now the
following set $A = \{n : n \in {}^*N \text{ and } |{}^*s_{\omega_0} - {}^*s_n| < 1\}$ of natural numbers. From Theorem
3.10 it follows that A is internal. Furthermore, by hypothesis ${}^*N - N \subset A$. If
$n \in N$ is finite and $n \in A$, then $|{}^*s_{\omega_0}| \le |{}^*s_{\omega_0} - {}^*s_n| + |{}^*s_n| \in M_0$, contrary to the
assumption that ${}^*s_{\omega_0}$ is infinite. This shows that $n \notin A$, and so
$A = {}^*N - N$. This contradicts the fact that ${}^*N - N$ is not internal (Theorem 5.1),
and so ${}^*s_\omega$ is finite for all $\omega \in {}^*N - N$, and the proof is finished.


REMARK. The above proof shows also that an infinite sequence $\{s_n : n = 1, 2, \ldots\}$
is bounded if and only if ${}^*s_\omega - {}^*s_{\omega'}$ is finite for all $\omega, \omega' \in {}^*N - N$.

The following result of A. Robinson (see [9]) concerning internal sequences 
will be used in Section 9. 


THEOREM 6.5. Let $\{a_n : n \in {}^*N\}$ be an internal sequence of real numbers such
that $a_n$ is infinitely small for all finite $n \in N$. Then there exists an infinitely large
natural number $\omega \in {}^*N - N$ such that $a_n =_1 0$ for all $n \le \omega$.


Proof. Consider the internal sequence $\{n a_n : n \in {}^*N\}$ and let $A = \{n : n \in {}^*N$
and $(\forall k)\,[k \in {}^*N \wedge k \le n] \Rightarrow k|a_k| \le 1\}$. Then, by Theorem 3.10, A is internal.
Since the hypothesis $a_n =_1 0$ for all finite $n \in N$ implies $n a_n =_1 0$ for all finite $n \in N$
it follows that $N \subset A$. Since, by Theorem 5.2, the set N is external and since A is
internal, $A - N \ne \emptyset$. Hence, there exists an infinitely large natural number $\omega \in A$.
Then for all infinitely large $n \le \omega$ the condition $n|a_n| \le 1$ implies that
$0 \le |a_n| \le 1/n =_1 0$, and the proof is finished.


7. Sequences that are asymptotically linear. A standard sequence of real numbers
$\{s_n : n = 1, 2, \ldots\}$ is called asymptotically linear whenever there exists a real constant
$\sigma \in R$ such that $s_n = n\sigma + o(n)$, $n \in N$.

A now classical result of Pólya and Szegő states that if a sequence $\{s_n : n = 1, 2, \ldots\}$
is almost additive, that is, there exists a constant s such that $|s_{n+m} - s_n - s_m| \le s$
for all $n, m = 1, 2, \ldots$, then $\{s_n\}$ is asymptotically linear.

As another illustration of the use of infinitely small and infinitely large numbers
we shall prove here in a nonstandard fashion the following slightly more general
result.


THEOREM 7.1. Let $\{s_n : n = 1, 2, \ldots\}$ be a standard sequence of real numbers
for which there exist constants p, s such that $0 < p < 1$ and $|s_{n+m} - s_n - s_m|
\le s(n^p + m^p)$ for all $n, m = 1, 2, \ldots$. Then there exists a constant $\sigma \in R$ such that
$|s_n - n\sigma| \le s n^p/(1 - 2^{p-1})$ for all $n = 1, 2, \ldots$. In particular, $\{s_n\}$ is asymptotically
linear.


Proof. From the hypothesis it follows immediately that for all $k = 1, 2, \ldots$
and for all $n = 1, 2, \ldots$ we have

$$\left|\frac{s_{2^k n}}{2^k} - s_n\right| \le s n^p\, \frac{1 - 2^{(p-1)k}}{1 - 2^{p-1}}. \tag{7.2}$$

Then it follows from the F.T. that, by passing to ${}^*(\hat{R})$, (7.2) holds for all $k, n \in {}^*N$.
In particular, if $k = \omega \in {}^*N - N$ is infinitely large, then

$$\left|\frac{{}^*s_{2^\omega n}}{2^\omega} - {}^*s_n\right| \le s n^p\, \frac{1 - 2^{(p-1)\omega}}{1 - 2^{p-1}} \quad\text{for all } n \in {}^*N. \tag{7.3}$$

Since $0 < p < 1$, and $\omega$ is infinitely large, $2^{(p-1)\omega}$ is infinitely small, and so
for all finite n we see that ${}^*s_{2^\omega n}/2^\omega$ is finite. Let

$$a_n = \frac{{}^*s_{2^\omega n}}{2^\omega}, \quad n \in {}^*N.$$

Then the internal sequence $\{a_n : n \in {}^*N\}$ satisfies, by hypothesis, the condition that
$|a_{n+m} - a_n - a_m| \le s\, 2^{(p-1)\omega}(n^p + m^p)$ for all $n, m \in {}^*N$. Since $a_n$ is finite for all
finite $n \in N$ we obtain by setting $t_n = \mathrm{st}(a_n)$, $n \in N$, that $|t_{n+m} - t_n - t_m| = 0$, that
is, $t_n = n t_1 = n\sigma$, $n = 1, 2, \ldots$. Finally, if we take standard parts in (7.3) keeping
n finite we obtain that $|s_n - n\sigma| \le s n^p/(1 - 2^{p-1})$, and the proof is finished.
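
The doubling argument behind (7.2) can be tried out on a concrete sequence. The sketch below is illustrative only: the sample sequence $s_n = \sigma n + c\,n^p$ and the constants `SIGMA`, `C`, `P` are assumptions made here (such a sequence satisfies the hypothesis of Theorem 7.1 with s = c, since $(n+m)^p \le n^p + m^p$ for $0 < p < 1$). Large finite k stands in for the infinite exponent ω.

```python
SIGMA, C, P = 2.5, 1.0, 0.5   # hypothetical sample constants, 0 < P < 1


def s(n):
    """Sample sequence s_n = SIGMA*n + C*n**P (illustration only)."""
    return SIGMA * n + C * n ** P


def hypothesis_holds(limit):
    """Check |s_{n+m} - s_n - s_m| <= C*(n**P + m**P) for 1 <= n, m <= limit."""
    return all(abs(s(n + m) - s(n) - s(m)) <= C * (n ** P + m ** P)
               for n in range(1, limit + 1) for m in range(1, limit + 1))


def doubling_gap(n, k):
    """Left-hand side of (7.2): |s_{2^k n} / 2^k - s_n|."""
    return abs(s(2 ** k * n) / 2 ** k - s(n))


def bound_72(n, k):
    """Right-hand side of (7.2) with the constant s taken to be C."""
    return C * n ** P * (1 - 2 ** ((P - 1) * k)) / (1 - 2 ** (P - 1))


def slope_estimate(k):
    """a_1 = s_{2^k} / 2^k; for large k this approaches sigma = t_1."""
    return s(2 ** k) / 2 ** k


if __name__ == "__main__":
    print(hypothesis_holds(30))
    print([round(slope_estimate(k), 6) for k in (1, 5, 10, 20, 40)])
```

In the proof the slope σ is obtained as the standard part of $a_1$ for an infinite ω; numerically, `slope_estimate(k)` settling down to `SIGMA` as k grows is the finite analogue of that step.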


8. Continuity and differentiability. Let f be a real-valued function of a real variable
which is defined on an open interval $a < x < b$ of R. On passing to ${}^*(\hat{R})$ the function
f extends to a function ${}^*f$ whose domain of definition is the open interval $a < x < b$,
$x \in {}^*R$, and with values in ${}^*R$. Furthermore, we have to keep in mind that the F.T.
implies that ${}^*f$ satisfies in ${}^*(\hat{R})$ all the properties of f as far as they can be expressed
by sentences of $K_0$.

For instance, if for some $a < x_0 < b$, $\lim_{x\to x_0} f(x) = l$ holds, then the following
sentence belongs to $K_0$:

$$(\forall \varepsilon)\,[0 < \varepsilon \in R] \Rightarrow (\exists \delta)\,[0 < \delta \in R] \wedge (\forall x)\,[x \in R \wedge 0 < |x - x_0| < \delta]
\Rightarrow [\,|f(x) - l| < \varepsilon\,].$$

Using the same methods as in the proof of Theorem 6.2 we obtain immediately
the following result.


THEOREM 8.1. $\lim_{x\to x_0} f(x) = l$ if and only if ${}^*f(x_0 + h) =_1 l$ for all $0 \ne h \in M_1$.
In particular, f is continuous at $x_0$ if and only if ${}^*f(x_0 + h) =_1 f(x_0)$ for all $h \in M_1$,
that is, equivalently, $\mathrm{st}({}^*f(a)) = f(\mathrm{st}(a))$ for all $a \in {}^*R$ such that $\mathrm{st}(a) = x_0$.


The derivative of f at $x_0$ exists if and only if

$$\lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$

exists. Thus, by Theorem 8.1, f is differentiable at $x_0$ if and only if there exists a constant
$l \in R$ such that

$$\frac{{}^*f(x_0 + h) - {}^*f(x_0)}{h} =_1 l$$

for all $0 \ne h \in M_1$. As we might have expected the derivative of a differentiable
function is the standard part of the quotient of infinitesimals

$$\frac{\Delta f}{\Delta x} = \frac{{}^*f(x + \Delta x) - f(x)}{\Delta x},$$

where $\Delta x \ne 0$ denotes an infinitesimal.
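
As a worked illustration of this description of the derivative, take $f(x) = x^2$ at a standard point x. The function is an example chosen here, not one from the text; the computation assumes only that the F.T. lets us calculate in *R with the usual field rules and that st is a ring homomorphism which is the identity on R.

```latex
% Quotient of infinitesimals for f(x) = x^2, with \Delta x \neq 0 infinitesimal:
\[
  \frac{\Delta f}{\Delta x}
  = \frac{(x + \Delta x)^2 - x^2}{\Delta x}
  = \frac{2x\,\Delta x + (\Delta x)^2}{\Delta x}
  = 2x + \Delta x .
\]
% Taking standard parts, with st(x) = x and st(\Delta x) = 0:
\[
  f'(x) = \operatorname{st}\!\left(\frac{\Delta f}{\Delta x}\right)
        = \operatorname{st}(2x + \Delta x) = 2x .
\]
```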

If f is differentiable at $x_0$, then f is continuous at $x_0$. Indeed, from
${}^*f(x_0 + h) - f(x_0) =_1 h f'(x_0)$ for all $0 \ne h \in M_1$ it follows, using $h f'(x_0) =_1 0$, that
${}^*f(x_0 + h) - f(x_0) =_1 0$ for all $h \in M_1$.

A real function f defined on an arbitrary interval is uniformly continuous whenever
for every $0 < \varepsilon \in R$ there exists a constant $0 < \delta \in R$ such that $|f(x) - f(y)| < \varepsilon$
for all $x, y \in \mathrm{dom}\, f$ and $|x - y| < \delta$. In passing to ${}^*(\hat{R})$ we obtain immediately the
following criterion for uniform continuity.


THEOREM 8.2. Let f be a real function of a real variable. Then f is uniformly
continuous if and only if ${}^*f(a) =_1 {}^*f(b)$ for all $a, b \in \mathrm{dom}\, {}^*f$ and $a =_1 b$.
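
The criterion is convenient for showing that a function is not uniformly continuous: it suffices to exhibit one pair of infinitely close points with values that are not infinitely close. The following sketch uses $f(x) = x^2$ on all of R, an example chosen here for illustration and not taken from the text, together with an arbitrary infinite $\omega \in {}^*N - N$.

```latex
% f(x) = x^2 on R; take a = omega and b = omega + 1/omega with omega infinite.
\[
  b - a = \frac{1}{\omega} =_1 0 ,
  \qquad\text{so } a =_1 b ,
\]
\[
  {}^*f(b) - {}^*f(a)
  = \left(\omega + \frac{1}{\omega}\right)^{2} - \omega^{2}
  = 2 + \frac{1}{\omega^{2}} =_1 2 \neq 0 .
\]
% Hence *f(a) and *f(b) are not infinitely close although a =_1 b,
% and by the criterion f is not uniformly continuous on R.
```

Note that the points a and b are infinite; on a bounded closed interval every point of the extended interval is finite, which is where the argument for Heine's theorem gets its leverage.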


From the above results the following famous theorem of Heine can now be 
obtained immediately. 


THEOREM 8.3. (Heine). Let f be a real function of a real variable defined on
the bounded and closed interval $x_1 \le x \le x_2$, $x_1, x_2 \in R$. If f is continuous, then
f is uniformly continuous.


Proof. Let $a, b \in {}^*R$ satisfy $x_1 \le a, b \le x_2$ and $a =_1 b$. Then $a, b \in M_0$, and
$x = \mathrm{st}(a) = \mathrm{st}(b)$ satisfies $x_1 \le x \le x_2$. Since f is continuous we have, by Theorem
8.1, that ${}^*f(a) =_1 f(x) =_1 {}^*f(b)$, and so ${}^*f(a) =_1 {}^*f(b)$, that is, by Theorem 8.2,
f is uniformly continuous, and the proof is finished.

For a more detailed account of the theory of real functions of a real variable in 
non-standard analysis we refer the reader to [3] and [8]. 


9. Euler's product for the sine function. On passing from $\hat{R}$ to ${}^*(\hat{R})$, the elementary
functions of the calculus such as the functions $\log x$, $e^x$, $\sin x$, $\cos x$, and
so on, extend to functions defined in ${}^*R$ and which have the same properties as their
standard counterparts as far as they can be expressed by sentences of $K_0$. In order
to simplify the notation we shall not use the *-notation to denote the extensions
of the elementary functions. Thus, for instance, in place of writing ${}^*(\sin)(x)$, $x \in {}^*R$,
we simply write $\sin x$, $x \in {}^*R$. For a discussion of the elementary functions of ${}^*(\hat{R})$
we refer the reader to [3].




One of the many beautiful formulas which were discovered by Euler is the so-called
product formula for the sine function. By this we mean the following formula:

$$\sin z = z \prod_{k=1}^{\infty} \left(1 - \frac{z^2}{k^2\pi^2}\right), \qquad z \text{ complex}. \tag{9.1}$$

Nowadays this representation for the sine function belongs to that part of function
theory that studies the behavior of entire functions whenever their zeros are given.
There one learns that the quotient of the functions on the left and right-hand side
of (9.1) is a function of the form $e^f$, where f is entire. The whole problem is then
to determine f, and, in fact, to show that $f = 0$ in the case of the sine function.
There are many proofs known for this result. Some of the proofs are even elementary.
But all of these proofs are somewhat artificial in the sense that they rely on some
analytical trick. It is therefore not without interest to examine how Euler proved
his formula. As far as the author knows, Euler's original proof is contained in his
book Introductio in Analysin Infinitorum which appeared in 1748. It runs as follows.
The mathematical expressions such as "infinitely large" and "infinitely close"
which occur in it are Euler's and not the author's.
For infinitely large values of n we have

$$2\sinh x = \left(1 + \frac{x}{n}\right)^n - \left(1 - \frac{x}{n}\right)^n. \tag{9.2}$$

We are now going to factorize the polynomial occurring on the right-hand side
of (9.2), by observing that $a^n - b^n = (a - b)(a - \varepsilon_1 b)\cdots(a - \varepsilon_{n-1} b)$, where 1,
$\varepsilon_1, \ldots, \varepsilon_{n-1}$ are the nth roots of unity. Now combine the pairs of complex conjugate
roots to obtain the real quadratic polynomials

$$\left(a - b\exp\left(\frac{2k\pi i}{n}\right)\right)\left(a - b\exp\left(-\frac{2k\pi i}{n}\right)\right) = a^2 + b^2 - 2ab\cos\frac{2k\pi}{n},$$

and so, since $a^2 + b^2 = 2 + (2x^2/n^2)$ and $2ab = 2 - (2x^2/n^2)$, we obtain

$$2\left(1 - \cos\frac{2k\pi}{n}\right) + 2\left(1 + \cos\frac{2k\pi}{n}\right)\frac{x^2}{n^2} = 4\sin^2\frac{k\pi}{n}\left(1 + \frac{x^2}{n^2\tan^2(k\pi/n)}\right).$$

It follows that the polynomial is divisible by x and, for all values of $k = 1, 2, \ldots$,
by $1 + x^2/(n^2\tan^2(k\pi/n))$. Since n is infinitely large this factor is infinitely close to
$1 + x^2/(k^2\pi^2)$. Furthermore, it is easy to see that the coefficient of x is equal to 2,
and so we obtain that

$$\sinh x = x \prod_{k=1}^{\infty} \left(1 + \frac{x^2}{k^2\pi^2}\right). \tag{9.3}$$

Finally, by applying it for $x = iz$, the required formula is obtained.
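
The product (9.3) and the substitution x = iz that turns it into (9.1) can be checked numerically with partial products. The following is a rough sketch for illustration only; the function name `sinh_product` and the cut-off `terms` are choices made here, and a finite partial product only approximates the infinite one (the relative error after K factors is roughly $|x|^2/(\pi^2 K)$).

```python
import math


def sinh_product(x, terms):
    """Partial product x * prod_{k=1}^{terms} (1 + x^2 / (k^2 pi^2)).

    Works for real or complex x; with x = 1j*z it gives i * (partial
    product of the sine formula), mirroring the substitution x = iz.
    """
    prod = x
    for k in range(1, terms + 1):
        prod *= 1 + x * x / (k * k * math.pi ** 2)
    return prod


if __name__ == "__main__":
    x = 1.0
    for terms in (10, 1000, 20000):
        print(terms, sinh_product(x, terms), math.sinh(x))
    z = 0.7
    print(sinh_product(1j * z, 20000), 1j * math.sin(z))
```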




The reader who has read this far will agree with the author that Euler’s proof 
is a typical example of the way infinitely large and infinitely small numbers were 
used with great success in the early stages of the development of the calculus. It 
is, however, no wonder that the inability to give the theory of infinitely large and 
infinitely small numbers a firm foundation led to the unacceptability of such proofs.
Of course, it is no problem at all with the methods of nonstandard analysis to 
make Euler’s proof precise. 

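From Theorem 6.1 it follows that for all standard $x \in R$ and for all infinitely large
natural numbers $\omega \in {}^*N - N$ we have

(9.4)  $2\sinh x =_1 \left(1+\frac{x}{\omega}\right)^{\omega} - \left(1-\frac{x}{\omega}\right)^{\omega}.$

Factorizing the polynomial as before leads to the formula

(9.5)  $\left(1+\frac{a}{m}\right)^{m} - \left(1-\frac{a}{m}\right)^{m} = \frac{2^{m}a}{m}\prod_{k=1}^{[(m-1)/2]}\sin^2\frac{k\pi}{m}\,\prod_{k=1}^{[(m-1)/2]}\left(1+\frac{a^2}{m^2\tan^2(k\pi/m)}\right)$

for all $a \in {}^*R$ and for all $m \in {}^*N$, where $[(m-1)/2]$, as in Section 4, denotes the
largest natural number $\le (m-1)/2$. Dividing by $a \ne 0$ and letting $a = 0$ shows that

(9.6)  $\frac{2^{m}}{m}\prod_{k=1}^{[(m-1)/2]}\sin^2\frac{k\pi}{m} = 2$ for all $m \in {}^*N$.

Thus we obtain finally that

(9.7)  $\left(1+\frac{x}{\omega}\right)^{\omega} - \left(1-\frac{x}{\omega}\right)^{\omega} = 2x\prod_{k=1}^{[(\omega-1)/2]}\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right)$

for all $x \in R$ and for all $\omega \in {}^*N - N$.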
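For finite m the factorization is an exact algebraic identity, so it can be tested in floating point. The sketch below assumes a finite natural number m in place of the infinitely large number; the helper names (`half`, `sine_product`, `power_difference`, `tangent_product`) are hypothetical. It checks that (2^m/m) times the product of sin^2(k pi/m) equals 2, and that (1 + x/m)^m - (1 - x/m)^m equals 2x times the product of the factors 1 + x^2/(m^2 tan^2(k pi/m)).

```python
import math

def half(m: int) -> int:
    """[(m-1)/2], the largest natural number <= (m-1)/2."""
    return (m - 1) // 2

def sine_product(m: int) -> float:
    """(2^m / m) * prod_{k=1}^{[(m-1)/2]} sin^2(k pi / m); expected to equal 2."""
    value = 2.0 ** m / m
    for k in range(1, half(m) + 1):
        value *= math.sin(k * math.pi / m) ** 2
    return value

def power_difference(x: float, m: int) -> float:
    """(1 + x/m)^m - (1 - x/m)^m."""
    return (1 + x / m) ** m - (1 - x / m) ** m

def tangent_product(x: float, m: int) -> float:
    """2x * prod_{k=1}^{[(m-1)/2]} (1 + x^2 / (m^2 tan^2(k pi / m)))."""
    value = 2.0 * x
    for k in range(1, half(m) + 1):
        value *= 1 + x * x / (m * math.tan(k * math.pi / m)) ** 2
    return value

for m in (1, 2, 3, 4, 7, 10, 25):
    print(m, sine_product(m), power_difference(0.7, m), tangent_product(0.7, m))
```

The two right-hand columns agree for every m, odd or even; only the passage to an infinitely large m requires the nonstandard machinery.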
We shall now prove the following lemma. 


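LEMMA 9.8. If $x \in R$ is standard, then for all infinitely large $\omega \in {}^*N - N$
we have

$\prod_{k=1}^{[(\omega-1)/2]}\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right) =_1 \prod_{k=1}^{\infty}\left(1+\frac{x^2}{k^2\pi^2}\right).$

Proof. Since for all $x \in R$ the infinite product $\prod_{k=1}^{\infty}(1 + x^2/k^2\pi^2)$ is
convergent, it follows from Theorem 6.1 that

(9.9)  $\prod_{k=1}^{\infty}\left(1+\frac{x^2}{k^2\pi^2}\right) =_1 \prod_{k=1}^{[(\omega-1)/2]}\left(1+\frac{x^2}{k^2\pi^2}\right)$

for all $x \in R$ and for all $\omega \in {}^*N - N$.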


Since $\omega^2\tan^2(k\pi/\omega) \ge k^2\pi^2$ for all $1 \le k \le [(\omega-1)/2]$ we obtain that

$\sum_{k=1}^{[(\omega-1)/2]}\left(\log\left(1+\frac{x^2}{k^2\pi^2}\right) - \log\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right)\right) \ge 0$ for all $x \in R$.

From Theorem 3.10 it follows that the following sequence is internal:

(9.10)  $\eta_n = \sum_{k=1}^{n}\left(\log\left(1+\frac{x^2}{k^2\pi^2}\right) - \log\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right)\right)$, $n \in {}^*N$ and $x \in R$.

If n is finite, then, by Theorem 8.1, the continuity of the log-function and $n/\omega =_1 0$,
it follows that

$\log\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right) =_1 \log\left(1+\frac{x^2}{k^2\pi^2}\right)$, $x \in R$,

and so $\eta_n =_1 0$ for all finite $n \in N$. Then it follows from Theorem 6.5 that there
exists an infinitely large natural number $\nu \le [(\omega-1)/2]$ such that $\eta_n =_1 0$ for all
$n \le \nu$.

Observing that

$\log\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right) \ge 0$ for all $1 \le k \le [(\omega-1)/2]$,

we obtain that

$0 \le \eta_{[(\omega-1)/2]} \le \eta_\nu + \sum_{k=\nu+1}^{[(\omega-1)/2]}\log\left(1+\frac{x^2}{k^2\pi^2}\right) =_1 \sum_{k=\nu+1}^{[(\omega-1)/2]}\log\left(1+\frac{x^2}{k^2\pi^2}\right).$

From Cauchy’s criterion in the form of Theorem 6.4 it follows, however, that
$\sum_{k=\nu+1}^{[(\omega-1)/2]}\log(1 + (x^2/k^2\pi^2)) =_1 0$ for all $x \in R$, and so we obtain that $\eta_{[(\omega-1)/2]} =_1 0$.
Finally, the lemma follows from the continuity of the log-function.


In order to complete the proof observe that from (9.4) and (9.7) it follows that
for all standard $x \in R$ we have

$\sinh x =_1 x\prod_{k=1}^{[(\omega-1)/2]}\left(1+\frac{x^2}{\omega^2\tan^2(k\pi/\omega)}\right)$, $\omega \in {}^*N - N$,

and so by taking standard parts using Lemma 9.8 we obtain finally that

$\sinh x = x\prod_{k=1}^{\infty}\left(1+\frac{x^2}{k^2\pi^2}\right)$ for all $x \in R$.

From the latter formula the product formula can be obtained by using the
uniqueness theorem for analytic functions. In this connection it is not without
interest to remark that a slight extension of the argument presented above will give
the result for all complex $z \ne \pm k\pi$, k = 0, 1, 2, .... We shall leave it to the reader
to verify this.


10. Nonmeasurable functions. In this final section of the present paper we 
shall present a simple example of a function which is not measurable in the sense 
of Lebesgue. The construction or rather the definition of the example will be based 
on the theory of infinitely large and infinitely small numbers. 

In a previous paper [5], the present author already defined such a function. 
It involved some nontrivial properties of the sine-function in *R. We shall follow 
here another idea. 

Let $\omega \in {}^*N - N$ be an infinitely large natural number. Then by Theorem 3.10
the following function is internal.

(10.1)  $\varphi(x) = [2^{\omega}x] - 2[2^{\omega-1}x]$, $x \in {}^*R$.

The internal function $\varphi$ is obviously periodic modulo one. It can also be defined
as the $\omega$th coefficient of the dyadic expansion of $x - [x]$ ($x \in {}^*R$), and so it takes
on only the values 0 and 1.
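A finite analogue of this function is easy to experiment with. The sketch below is an illustration only: it assumes a finite index n in place of the infinitely large index, uses exact rational arithmetic, and the name `dyadic_digit` is hypothetical. For finite n only dyadic numbers whose denominator divides 2^(n-1) are periods; with an infinitely large index every standard dyadic number becomes a period.

```python
from fractions import Fraction
from math import floor

def dyadic_digit(n: int, x: Fraction) -> int:
    """[2^n x] - 2 [2^(n-1) x]: the n-th binary digit of x - [x], for n >= 1."""
    return floor(2 ** n * x) - 2 * floor(2 ** (n - 1) * x)

x = Fraction(5, 8)                                  # 0.101 in binary
print([dyadic_digit(n, x) for n in range(1, 5)])    # prints [1, 0, 1, 0]

third = Fraction(1, 3)                              # 0.010101... in binary
# Shifting by a dyadic number with a small denominator leaves the digit unchanged.
print(dyadic_digit(6, third), dyadic_digit(6, third + Fraction(3, 16)))
# Reflecting a non-dyadic x about 1/2 flips the digit.
print(dyadic_digit(6, third), dyadic_digit(6, 1 - third))
```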

By f we shall denote the restriction of $\varphi$ to the set of standard real numbers R
of *R. Then the following result holds.


THEOREM 10.2. The real function $f(x) = [2^{\omega}x] - 2[2^{\omega-1}x]$, $x \in R$, is not
measurable in the sense of Lebesgue.


Proof. Observe that f has the following properties. (i) For every (standard)
dyadic number d, 0 < d < 1, f(d) = 0. (ii) Every dyadic number d, $0 \le d < 1$,
is a period of f, that is, f(x + d) = f(x) for all $x \in R$. (iii) For all x, $0 \le x < 1$,
we have f(1 − x) = 1 − f(x) provided x is not dyadic. (i) and (ii) follow immediately
from the fact that since $\omega$ is infinitely large, $2^{\omega}d$ and $2^{\omega-1}d$ are natural numbers
for all standard dyadic numbers d, 0 < d < 1. (iii) follows from the fact that f(x)
is the $\omega$th coefficient of the dyadic expansion of x in *R. We shall now assume
that f is a measurable function. We shall have to use the following well-known
result.


(10.3) A measurable function which has arbitrarily small periods is equal 
to a constant almost everywhere. 


From the assumption that f is measurable, property (ii) of f, (10.3), and the fact
that f takes on only the values 0 and 1 it follows that f = 0 a.e. or f = 1 a.e.

Let $A = \{x : 0 \le x \le 1 \text{ and } f(x) < 1/2\}$. Then A is a measurable set, and the
characteristic function of A has all the dyadic numbers as periods, and so, by (10.3),
m(A) = 0 or m(A) = 1, where m denotes the Lebesgue measure. Consider now
also the set $B = \{x : 0 \le x < 1 \text{ and } f(x) > 1/2\}$. Then properties (ii) and (iii) of f
imply that if x is not dyadic, 0 < x < 1, then $x \in A$ if and only if $1 - x \in B$. Thus
the set $A_0$ of nondyadic points of A and the set $B_0$ of nondyadic points of B are
symmetric with respect to the point 1/2. Since the set of dyadic points is countable
its Lebesgue measure is zero and so $m(A) = m(A_0) = m(B_0) = m(B)$. Then
$A_0 \cap B_0 = \emptyset$, $m(A_0 \cup B_0) \le 1$, $m(A_0) = m(B_0)$ and $m(A_0) = 0$ or $m(A_0) = 1$
imply that $m(A_0) = m(B_0) = 0$. Hence, f(x) = 1/2 a.e., which contradicts the fact
that f does not take on the value 1/2. We conclude that f is not measurable in the
sense of Lebesgue and the proof is finished.


Work on this paper was also supported in part by Grant No. GP-7691 from the National Science 
Foundation. 


References 


1. A. R. Bernstein and A. Robinson, Solution of an invariant subspace problem of K. T. Smith
and P. R. Halmos, Pacific J. Math., 16 (1966) 421-431.

2. A. R. Bernstein, Invariant subspaces of polynomially compact operators on a Banach space,
Pacific J. Math., 21 (1967) 445-464.

3. W. A. J. Luxemburg, Non-Standard Analysis. Lectures on A. Robinson’s theory of
infinitesimals and infinitely large numbers, Pasadena (1962) and revised edition (1964).

4. ———, Two applications of the method of construction by ultrapowers, Bull. Amer. Math.
Soc., (2) 68 (1962) 416-419.

5. ———, Addendum to “On the measurability of a function which occurs in a paper by
A. C. Zaanen”, Proc. Roy. Acad. Sci., Amsterdam, A66 (1963) 587-590.

6. ———, Applications of Model Theory to Algebra, Analysis and Probability Theory.
Proceedings of an International Symposium on Nonstandard Analysis, Holt-Rinehart and Winston,
New York, 1969.

7. A. Robinson, Non-standard analysis, Proc. Roy. Acad. Sci., Amsterdam, A64 (1961) 432-440.

8. ———, Non-standard Analysis, Studies in Logic and the Foundations of Mathematics,
North-Holland, Amsterdam, 1966.

9. ———, On generalized limits and linear functionals, Pacific J. Math., 14 (1964) 269-283.

10. ———, On the theory of normal families, Acta Philos. Fenn., 18 (R. Nevanlinna anniversary
volume) (1965) 159-184.

11. ———, On some applications of model theory to algebra and analysis, Rend. Mat. e
Appl., 25 (1966) 1-31.

12. ———, A new approach to the theory of algebraic numbers, Atti Accad. Naz. Lincei
Rend., (8) 40 (1966) 222-225, 770-774.

13. ———, Nonstandard theory of Dedekind rings, Proc. Roy. Acad. Sci., Amsterdam, A70
(1967) 444-452.

14. ———, Nonstandard arithmetic, Bull. Amer. Math. Soc., 73 (1967) 818-843.

15. R. T. Taylor, Invariant Subspaces in Hilbert and Normed Spaces, Ph. D. Thesis, Calif.
Institute of Technology, Pasadena, 1968.


RECURSIVE FUNCTIONS AND HIERARCHIES 
HILARY PUTNAM, Harvard University 


1. Introduction. Everyone knows that modern computing machines can solve 
many mathematical problems. It is true that they do not exercise “ingenuity” in
solving these problems: the operator has to give the machine exact rules of procedure,
and the machine can only proceed by following these rules—the so-called
“program”—step by step. Moreover, the rules of procedure must themselves be framed in
terms of a highly limited vocabulary—a vocabulary corresponding, in a certain 
sense, to the elementary operations that the machine is able to carry out. But these 
limitations are compensated for, to some extent, by the immense speed of the machines 
which enables them to duplicate many lifetimes of human computation in a few 
minutes. 

When we attempt to study what such machines can or cannot in principle ac- 
complish, we obviously have to make certain idealizations. It goes without saying 
that we ignore wear and tear on the machines, mechanical malfunctions, etc.—i.e., 
that what we consider are abstract machines. An important part of the idealization 
is this: we pretend that the machines have available a potentially infinite external
memory. 

To suppose that the external memory is potentially infinite is to suppose that 
there is no limit to the number of filing cabinets full of IBM cards (or whatever) 
that may be placed next to the machine for the machine to use for data storage— 
and also that the machine has the right to ‘‘demand’’ more such external memory 
space if it is necessary for it to store more data. In short, the machine is supposed to 
be able to store and recover arbitrary finite amounts of data. 

When we make these idealizations—no wear and tear on the machine (i.e., the 
machine is ‘‘immortal’’), no mechanical malfunctions, potentially infinite external 
memory—something really remarkable happens. It turns out that the problem 
solving capacity of any one of the standard computing machines is identical with 
the problem solving capacity of any of the others. Of course, one machine may solve 
a problem quickly while another solves it more slowly, depending on the design of 
the machine, but if one machine can solve a problem P at all, any standard compu- 
ting machine can solve it (given enough time). 

Moreover, it turns out that certain very simple machines—very simple to describe 
mathematically—are “‘universal’’ in the sense that they can solve all of the problems 
that any computing machine can solve. These machines are ideal for the purpose of 
studying mathematically just what is and what is not computable in principle. 

Instead of beginning our discussion of computability with a mention of computing
machines, we might as well have started with the notion of an algorithm. Indeed, a
computing machine is merely a device for carrying out algorithms (a “universal”
machine is one that can be programmed to carry out an arbitrary algorithm). The
notion of algorithmic computability has been formalized in several different ways:
the basic result on which the subject of recursive function theory is founded is that
the class of functions computable in any of these different formalizations is exactly
the same. (The use of the term “basic result” to cover the coextensiveness of the
different formal definitions of algorithmic computability is due to Rogers.)


The thesis that any function that can be computed by following out the steps of
an algorithm belongs to the class of functions that are computable by using
Gödel-Kleene systems of equations, or Church λ-calculi, or Turing machines (in view of
the basic result it does not matter which of these notions of algorithmic computability
we employ to define the class in question), is not a precise mathematical statement
because it relates the essentially imprecise term “algorithm” to a precisely defined
term (e.g., “function computable by a Turing machine”). This thesis—Church’s
Thesis, as it has come to be called—nevertheless plays an important role in motivating
the study of the class of functions in question and of the formalisms for computability
just mentioned. The basic result constitutes part of the “evidence” for Church’s
Thesis: the fact with which we began this lecture, the fact that the functions computable
(or the “problems solvable”) using any one of the standard computing machines
are exactly the functions in this class, constitutes a further piece of evidence for
Church’s Thesis.


We shall not review the definition of a Turing machine here. We shall, however,
use the following facts about Turing machines in this lecture: (1) Turing machines
accept as input finite strings in a fixed finite alphabet. They store data and carry out
computations by scanning the input and by printing finite strings in the same
alphabet, if necessary erasing the original input in the course of the computation.
(In fact, they do all this subject to the restriction that they may only scan one symbol
at a time, and they may only move right or left one space at a time.) (2) When a
Turing machine has finished its computation it “halts” (i.e., it comes to rest in a
distinguished state, the “rest state”). The answer to the original problem (assuming
the machine gave an answer) is then to be found on the tape on which the machine
prints symbols. For example, if the answer is a number, then that number (in, say,
unary or binary notation) will be printed on the tape, flanked, if necessary, by
identifying marks. (There may be other “junk” left on the tape, but the answer will
always be distinguished by some such device as the device of an identifying mark at
the beginning and end.) (3) We assume that the machine has an infinite two-way
tape on which to compute: this is the special form that the assumption of a
“potentially infinite external memory” takes in the case of this particular kind of computing
device. However, the original input—what is printed on the tape at the start of a
computation—is always required to be finite. (4) The class of functions computable
by Turing machines with a fixed n-sign alphabet is independent of the size n of the
alphabet, provided $n \ge 2$. Here “function” means “function from natural numbers
to natural numbers”, and “computable” means that if the machine is given the
arguments of the function as input (i.e., these are printed on its tape in unary
notation), then it will always come to rest in a finite time (which may not be “easily
calculable” in advance given the arguments), and the correct value of the function
for the arguments given as input will be found printed on its tape. (Also, it is required
that this value be the only number printed on its tape when it halts, or the only
number flanked by the distinguishing mark, if a distinguishing mark is used.) (5) One
can describe all possible Turing machines over a fixed n-sign alphabet by means of
certain canonical expressions. These expressions can themselves be coded in the
n-sign alphabet (they can even be coded as natural numbers), and a suitable Turing
machine with that alphabet can even enumerate all of the canonical descriptions,
in that coding. Henceforth, we shall imagine that a fixed n-sign alphabet (n equal to
or larger than 2) has been picked, and that a fixed coding of the canonical descriptions
of the Turing machines over that alphabet has been picked, and that the coded
canonical descriptions, in the order in which a certain Turing machine lists them, are
$C_1, C_2, C_3, \ldots$. We shall also use the expression $T_i$ for the Turing machine whose
description is $C_i$ (i = 1, 2, 3, ...): thus $T_1, T_2, T_3, \ldots$ is a list of all Turing machines
(over the fixed alphabet).
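Facts (1) to (3) can be made concrete with a very small simulator. This is only a sketch under stated assumptions: the rule-table format, the state names `q0` and `halt`, the blank sign `_`, and the step budget `max_steps` are all choices made here for illustration and do not come from the lecture. The step budget exists only because a simulation has to stop somewhere; an idealized machine has no such limit.

```python
from collections import defaultdict

BLANK = "_"  # blank square; "1" is the only other sign used in this sketch

def run_turing_machine(rules, tape_input, start="q0", halt="halt", max_steps=10_000):
    """Run a one-tape machine and return its tape contents once it halts.

    rules maps (state, scanned sign) to (sign to print, move, next state),
    with move -1 for one square left and +1 for one square right.
    Returns None if the machine is still running after max_steps steps.
    """
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))  # two-way infinite tape
    position, state, steps = 0, start, 0
    while state != halt:
        if steps == max_steps:
            return None
        sign, move, state = rules[(state, tape[position])]
        tape[position] = sign
        position += move
        steps += 1
    marked = [i for i, s in tape.items() if s != BLANK]
    if not marked:
        return ""
    return "".join(tape[i] for i in range(min(marked), max(marked) + 1))

# Successor in unary notation: run right over the 1s, print one more 1, halt.
successor_rules = {
    ("q0", "1"): ("1", +1, "q0"),
    ("q0", BLANK): ("1", +1, "halt"),
}

print(run_turing_machine(successor_rules, "111"))  # prints 1111
```

The machine scans one square at a time and moves one square at a time, and the finite rule table plays the part of a canonical description.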

Let A be an infinite set of “yes-no” questions, or an infinite set of numerical
questions (questions to which the answer is a natural number). (We assume the
questions in the set A to be expressions in the fixed n-sign alphabet.) If a Turing
machine can be programmed to write down the correct answer (“yes” or “no”, or
the appropriate natural number) whenever the machine is given a question belonging 
to the set A as input, we say the decision problem for A is solvable. Otherwise, we 
say that this problem is unsolvable. 

Historically, the first interest of recursive function theorists was in showing 
that various decision problems were unsolvable. ‘‘Negative solutions’’—proofs of 
unsolvability—were the hallmark of the field. While recursive function theory today 
covers many other topics, and while its methods are so widespread and so integrated 
with other branches of mathematical logic that it is difficult to say where recursive 
function theory ends and, say, set theory begins, the study of unsolvable problems 
remains an important special topic. 

Proofs that a problem is unsolvable by specified means are not new in the history 
of mathematics. Omar Khayyam had already conjectured that certain algebraic
equations are not solvable by means of radicals (indeed, he thought mistakenly that 
this already happens at degree three), and Galois showed that in fact the general 
fifth degree equation is not solvable. Showing that a problem is not solvable by 
means of Turing machines is importantly different from showing that it is not solvable 
by means of radicals, however. If an equation is not solvable by means of radicals, 
we can still express the solution in other ways, and there are perfectly good methods 
for obtaining this solution to as many decimal places as we wish. But if a decision 
problem is not solvable by means of Turing machines, we are stuck—there are no
more general uniform procedures. In this sense, the unsolvability proofs of recursive
function theory are the first proofs of absolute unsolvability in human history. 
(Of course, the statement that there are no more general uniform procedures than 
Turing machines is a form of Church’s thesis; thus this last remark presupposes that 
thesis.) 


2. The basic concepts of recursive function theory. The functions computable by
Turing machines are called recursive. f is recursive if and only if the decision problem
for questions of the form “f(n) = ?” is solvable.

A set of natural numbers is called recursive if and only if the characteristic
function of the set is recursive. A is recursive if and only if the decision problem for
questions of the form “Does n belong to A” is solvable.

An important distinction is the distinction between recursiveness and recursive 
enumerability. A set A of natural numbers is called recursively enumerable if A is 
either empty or the set of values taken on by some recursive function. In terms of 
Turing machines what this means is just that it is possible to program a Turing 
machine so that the machine will print out all and only the natural numbers in the 
set A. (We count the machine as having “printed out” a natural number just in case
that natural number appears on its tape with the fixed distinguishing mark at the
beginning and end at some stage. The requirement that the distinguishing mark
appear at the beginning and end enables us to distinguish what the machine actually
“prints out” from the “data” that the machine stores and from its “calculations”.)
Of course, if A is an infinite recursively enumerable set, we must allow the machine 
to run forever so that it can print out all the members of the set A. In short, the 
machine generates the members of A as an infinite list (not necessarily in any natural 
order). 
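The idea of a machine that runs forever while listing a set can be sketched with a generator. The example is illustrative only: the function `g` is a hypothetical total computable function chosen so that its values appear out of their natural order, and `enumerate_range` is a name introduced here.

```python
from itertools import count, islice

def enumerate_range(g):
    """Yield every value of the total computable function g, each exactly once."""
    seen = set()
    for n in count():            # runs forever, like the idealized machine
        value = g(n)
        if value not in seen:
            seen.add(value)
            yield value

def g(n: int) -> int:
    """Hypothetical example function; its set of values is the set being listed."""
    return n * n - 6 * n + 10

print(list(islice(enumerate_range(g), 6)))   # prints [10, 5, 2, 1, 17, 26]
```

Only a finite initial part of the list can ever be inspected, so a number that has not yet appeared may still appear later.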

If the “basic result” of recursive function theory is the coextensiveness of the
different definitions of algorithmic computability, then the following result might 
be called the Hauptsatz of the subject: There exists a recursively enumerable set of 
natural numbers which is not recursive. 

Let us give an example. Imagine a machine M which writes down all finite 
sequences of well-formed formulas of quantification theory (i.e., of elementary 
formal logic, including symbols for relations and symbols for the quantifiers “for
all x” and “there exists an x such that”). Such a Turing machine obviously exists.
If now we complicate the machine by having it test each finite sequence, to check if 
that finite sequence is or is not a proof on the basis of the rules of some particular 
standard system of quantification theory, and to print the distinguishing mark before 
and after the last line (i.e., the last formula in the finite sequence) if and only if the 
finite sequence is a proof, then what will the result be? Clearly the machine M will 
print out only theorems of quantification theory; and since every theorem is (by 
definition) the last line of some proof, M will print out each theorem of quantification 
theory. Thus the set TH of all theorems of quantification theory is a recursively 
enumerable set. 


However, Church proved in 1936 that the decision problem for TH is unsolvable. 
(By the “decision problem for TH” we mean, of course, the decision problem for
questions of the form “does F belong to TH”, where F is a well-formed formula
of quantification theory.) Thus TH is not a recursive set.

(By the above argument, the set of theorems of any formal system is a recursively 
enumerable set; but it is not a recursive set unless the formal system has a solvable 
decision problem.) 

This example has the defect that TH is a set of formulas and not a set of natural 
numbers, but this is inessential: it is trivial to code formulas of any formal system as 
natural numbers, and thus one can obtain a set of integers with the same property 
as TH: the property of being recursively enumerable but not recursive. (It is only 
necessary that the code employed be “mechanical”, that is, that the transition
between a formula and its number, or between a number and the formula it encodes, 
be capable of being carried out by a Turing machine. But any code that would 
naturally occur to one has this property.) 

In addition to the concepts just defined, the concept of a partial recursive 
function plays an important role in recursive function theory. A partial function is
a function whose domain is a subset of the natural numbers and whose range is a
subset of the natural numbers. (Thus ordinary “total” functions are also partial
functions, and the “totally undefined function”—i.e., the “empty” function—is a
partial function.) If the domain of a partial function f does not include a natural
number n, then f is said to be undefined on n. 

A partial function f is called partial recursive just in case a Turing machine can
be programmed to print out all and only the true statements of the form “f(n) = m”.
This is equivalent to requiring that it be possible to program a Turing machine so
that (1) it answers correctly all questions of the form “f(n) = ?”, where n is a
natural number on which f is defined; and (2) the machine runs forever (without
printing out an answer) when confronted with the question “f(n) = ?” if n is a
natural number on which f is not defined.
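Both descriptions of a partial recursive function can be sketched for one simple case. Everything named here is an assumption for illustration: the partial function is "the exact square root of n", defined only on perfect squares, and the parameter `fuel` is a demonstration device that cuts off a search which would otherwise never end. The domain of this particular function happens to be recursive; in general it need only be recursively enumerable.

```python
from itertools import count, islice
from typing import Optional

def exact_sqrt(n: int, fuel: Optional[int] = None) -> Optional[int]:
    """Search m = 0, 1, 2, ... for m*m == n.

    Halts with m when n is a perfect square. Otherwise the search never ends
    unless `fuel` caps the number of candidates tried, in which case None is
    returned and nothing has been learned about n.
    """
    m = 0
    while fuel is None or m < fuel:
        if m * m == n:
            return m
        m += 1
    return None

def graph_of_exact_sqrt():
    """List all and only the true statements f(n) = m, as pairs (n, m)."""
    for m in count():
        yield (m * m, m)

print(exact_sqrt(49))                          # prints 7
print(exact_sqrt(50, fuel=1000))               # prints None: budget exhausted
print(list(islice(graph_of_exact_sqrt(), 4)))  # prints [(0, 0), (1, 1), (4, 2), (9, 3)]
```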

The domain of a partial recursive function need not be recursive, but it is clear 
from the above definition that it must always be a recursively enumerable set. 


3. Unsolvable problems. As already remarked, the first interest of recursive 
function theorists was in showing that various decision problems were unsolvable. 
Once one has succeeded in showing that one decision problem is unsolvable—say, 
the decision problem for questions of the form ‘‘does n belong to A’’, where A is 
some fixed set of natural numbers—it is not difficult, in general, to go on and show 
that many other decision problems are unsolvable. For example, suppose that I can 
show that the decision problem for the set A would be solvable if the decision problem 
for a certain set B were solvable. I might do this, for example, by showing that there
is a recursive function f such that, for all n, $n \in A$ if and only if $f(n) \in B$. Since I
already know that the decision problem for A is unsolvable, it follows that the
decision problem for B is likewise unsolvable. In this way a great many mathematical
problems have been shown to be unsolvable. The decision problem mentioned in the
preceding section (the decision problem for the set TH) was shown to be unsolvable
by this method.
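The reduction step can be written out in a few lines. This is a sketch with hypothetical sets: B is taken to be the multiples of 6 and f(n) = 3n, so that A is the set of even numbers. Both sets are of course recursive; the point is only the shape of the argument, which transfers a decision procedure from B to A and therefore transfers unsolvability from A to B.

```python
def reduce_decider(decide_B, f):
    """Given a decider for B and a computable f with (n in A) <=> (f(n) in B),
    return a decider for A."""
    return lambda n: decide_B(f(n))

# Hypothetical instance: B = multiples of 6, f(n) = 3n, hence A = even numbers.
decide_B = lambda m: m % 6 == 0
decide_A = reduce_decider(decide_B, lambda n: 3 * n)

print([decide_A(n) for n in range(6)])   # prints [True, False, True, False, True, False]
```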

But this method presupposes that some problem other than the one whose 
unsolvability one wishes to establish is already known to be unsolvable. How was 
the first problem shown to be unsolvable? 

The answer turns on a basic theorem of recursive function theory—the enumera- 
tion theorem. This will now be sketched. 

For i = 1, 2, 3, ..., let $W_i$ be the set of all natural numbers $T_i$ ever prints out.
(Recall that $T_i$ is the ith Turing machine in the enumeration described in §1.) Since
every recursively enumerable set is the set of all integers printed out by some Turing
machine, it follows that $W_1, W_2, W_3, \ldots$ is an enumeration of all recursively enumerable
sets. Let J(x, y) be any one-one recursive function mapping pairs of integers onto
integers (for the sake of definiteness, take $J(x, y) = 2^x 3^y$). Then the enumeration
theorem states that the set of integers J(x, y) such that $x \in W_y$ is recursively
enumerable.
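The coding of pairs by products of powers of 2 and 3 and its effective decoding can be sketched as follows. The helper names `unpair` and `is_diagonal` are introduced here for illustration; `is_diagonal` tests whether a number is a code of a pair of the form (x, x).

```python
def J(x: int, y: int) -> int:
    """Code the pair (x, y) as 2^x * 3^y."""
    return 2 ** x * 3 ** y

def unpair(z: int):
    """Return (x, y) with J(x, y) == z, or None if z is not of that form."""
    if z < 1:
        return None
    x = y = 0
    while z % 2 == 0:
        z //= 2
        x += 1
    while z % 3 == 0:
        z //= 3
        y += 1
    return (x, y) if z == 1 else None

def is_diagonal(z: int) -> bool:
    """True exactly when z == J(x, x) for some x."""
    pair = unpair(z)
    return pair is not None and pair[0] == pair[1]

print(J(3, 5), unpair(J(3, 5)), unpair(10), is_diagonal(J(4, 4)))
```

Decoding only requires repeated division, so a machine can pass mechanically between a pair and its code.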

Whenever {J(x, y) | P(x, y)} is recursively enumerable (where P(x, y) is a two-place 
predicate of natural numbers), we say that the predicate P in question is a recursively 
enumerable two-place predicate. In this terminology, what the enumeration theorem 
says is that the two-place predicate “$x \in W_y$” is a recursively enumerable predicate.


The proof is a direct construction of a Turing machine which lists all possible 
computations of all possible Turing machines and examines these computations to 
discover all truths of the form “$x \in W_y$”.


A second theorem of recursive function theory is this: a set A is recursive if and
only if A and $\bar{A}$ (the complement of A) are both recursively enumerable. It is clear
that if A is recursive (i.e., a Turing machine can answer all questions of the form
“does n belong to A”), then A and $\bar{A}$ are both recursively enumerable. To verify the
converse, let $M_1$ be a machine which generates the members of A and let $M_2$ be a
machine which generates the members of $\bar{A}$. Then it is possible to construct a Turing
machine $M_3$ which operates as follows: $M_3$ lists all possible computations of both $M_1$
and $M_2$ (in a suitable encoding). Suppose someone gives $M_3$ a question of the form
“does n belong to A”. Then $M_3$ lists all computations of $M_1$ and $M_2$, as we just said,
and scans the “print out” of $M_1$ and the “print out” of $M_2$ until one or the other machine
prints out the given integer n. If $M_1$ ever prints out n, then $M_3$ prints out “yes” and
halts as soon as it comes to that computation of $M_1$; if $M_2$ ever prints out n, then
$M_3$ prints out “no” and halts as soon as it comes to that computation. Since n must
belong to either A or $\bar{A}$ (and not to both), either $M_1$ or $M_2$ must eventually print
out n, and so $M_3$ must eventually answer the question.
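The converse direction of this theorem is an interleaving argument, which the following sketch imitates with two generators. The example sets (primes and non-primes) and all function names are assumptions chosen for illustration; any pair of enumerations of a set and of its complement would do.

```python
from itertools import count

def decide(n, enumerate_set, enumerate_complement):
    """Alternate between the two enumerations until n shows up in one of them."""
    members, non_members = enumerate_set(), enumerate_complement()
    while True:
        if next(members) == n:
            return True
        if next(non_members) == n:
            return False

def primes():
    """Enumerate the primes (the hypothetical set A)."""
    for n in count(2):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n

def non_primes():
    """Enumerate the complement of the primes in the natural numbers."""
    for n in count(0):
        if n < 2 or any(n % d == 0 for d in range(2, int(n ** 0.5) + 1)):
            yield n

print(decide(97, primes, non_primes), decide(91, primes, non_primes))  # prints True False
```

Termination rests entirely on the fact that n must appear in exactly one of the two lists; with only one enumerator available the loop could run forever.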


Let $D = \{x \mid x \in W_x\}$. Since $x \in D$ if and only if J(x, x) belongs to the set
$\{J(x, y) \mid x \in W_y\}$, and this latter set is recursively enumerable, it is easily seen that D
is recursively enumerable (just program a Turing machine to scan all computations
of the machine that enumerates $\{J(x, y) \mid x \in W_y\}$ and to test the print out of that
machine to see if the number printed out is of the form J(x, x)—this is easy to
determine effectively. Instruct the machine that it is to print out those numbers that
it examines that pass this test—i.e., those numbers that are (1) printed out by the
original machine, and (2) of the form J(x, x).) But D is not recursive. For suppose it
were. Then $\bar{D}$, the complement of D, would be recursively enumerable—i.e., there
would be an integer k such that $\bar{D} = W_k$. Then,


$k \in \bar{D}$ iff k does not belong to D
iff it is not the case that $k \in W_k$
iff it is not the case that $k \in \bar{D}$


—a contradiction.
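The diagonal step can be seen in miniature on a finite family of sets. The sketch below is only an analogy: the list `family` is a hypothetical stand-in for the first few sets of the enumeration, indexed from 0 for convenience in Python, and `anti_diagonal` is a name introduced here. The set of indices k that do not belong to the kth set differs from every set in the family at the index k itself.

```python
def anti_diagonal(family):
    """family[k] stands in for the k-th set; return {k : k not in family[k]}."""
    return {k for k, members in enumerate(family) if k not in members}

family = [{0, 1}, {2}, {0, 2}, set()]      # hypothetical stand-ins
d_bar = anti_diagonal(family)

print(d_bar)                                # prints {1, 3}
# d_bar disagrees with the k-th set about k, for every index k.
print(all((k in d_bar) != (k in family[k]) for k in range(len(family))))  # prints True
```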


The set D—i.e., the set of all x such that $x \in W_x$—is thus an example (in fact, the
basic example) of a set which is recursively enumerable but not recursive, and the 
decision problem for D is the basic example of an unsolvable problem. The un- 
solvability of the decision problem for D was the basic tool used in establishing the 
unsolvability of the decision problem for quantification theory, the decision problem 
for number theory, and many other problems. 


4. Relative unsolvability. Let A and B be nonrecursive sets of natural numbers.
In other words, let the decision problem for A and the decision problem
for B (i.e., the decision problem for questions of the form “does $n \in A$” and the
decision problem for questions of the form “does $n \in B$”) both be unsolvable problems.
It may still happen that one of these two decision problems is “more unsolvable”
than the other; and the study of relative unsolvability—of “more” and
“less” unsolvability—has become a big topic in recursive function theory.

What is meant by saying that the decision problem for B is more unsolvable 
than the decision problem for A is that 

(1) If B were recursive, then A would be recursive; but 

(2) It is not the case that B would be recursive if A were. 

But how can we give (1) and (2) a precise meaning? 

What we want to formalize is the following notion: that a Turing machine could
answer all questions of the form “does $n \in A$” correctly, provided the machine had
access to an “oracle” for B—that is, to a device which answered correctly all questions
of the form “does $n \in B$”. One simple way of doing this is the following: modify
the notion of a Turing machine so that a Turing machine can scan and move two
tapes instead of one. Let one of these tapes contain at the beginning of the computation
only the particular question “does $n \in A$” which the machine is supposed to
answer. This is the tape on which the machine will perform its calculations and
print its final answer. Let the other tape contain all true statements of the form
“$n \in B$” and “it is not the case that $n \in B$”. (Apart from the contents of the second
tape, the machine is still describable by a finite canonical description, just like an
ordinary Turing machine. Moreover, the machine can scan either tape only one
square at a time, and can move either tape only one square at a time. Thus it can, in
fact, only use finitely much information about the set B in any one computation,
although it has all of the set B potentially available to it.) Call such a machine a
B-machine. Then call the set A recursive relative to B (or “recursive in B”) just in
case it is possible to program a B-machine to answer correctly an arbitrary question
of the form “does $n \in A$”. Finally, give (1) and (2) a precise meaning by saying:

(3) A is more unsolvable than B just in case B is recursive in A and A is not 
recursive in B. 

Another important case is the case in which A and B are recursive in each other. 
In this case we say that A and B are Turing equivalent. 
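An oracle can be modelled as a function that a program may call but whose inner workings it cannot inspect. The sketch below rests on assumptions made for illustration: B is an arbitrary membership test, A is taken to be the complement of B, and the names `make_oracle` and `decide_complement` are hypothetical. The same procedure run with an oracle for A decides B, so a set and its complement are recursive in each other.

```python
def make_oracle(membership_test):
    """Wrap a membership test and record which questions were asked."""
    queries = []
    def oracle(n):
        queries.append(n)
        return membership_test(n)
    return oracle, queries

def decide_complement(n, oracle):
    """Decide membership in the complement using one question to the oracle."""
    return not oracle(n)

# Hypothetical stand-in for B; the procedure never looks inside it.
oracle_B, asked = make_oracle(lambda n: n in {1, 4, 9})

print(decide_complement(4, oracle_B), decide_complement(5, oracle_B))  # prints False True
print(asked)                                                           # prints [4, 5]
```

Each computation consults the oracle only finitely often, which mirrors the remark that a B-machine uses only finitely much information about B in any one computation.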

We now introduce the notion of a degree of unsolvability : 

(4) A degree of unsolvability is the class of all sets of natural numbers which are 
Turing equivalent to some given set A. 

Here we are exploiting the fact that Turing equivalence is an equivalence relation; 
the degrees of unsolvability are just the equivalence classes of this relation. 

(5) If $d_1$ is the degree of unsolvability of a set A (i.e., A is an element of the
equivalence class $d_1$), and $d_2$ is the degree of unsolvability of a set B, then we say
that $d_1 \le d_2$ just in case A is recursive in B. (It is easily checked that this is well
defined, i.e., that whether $d_1 \le d_2$ or not is independent of the particular choice of
the representatives A, B.)

(6) The degree of the recursive sets (this is the same as the class of all recursive 
sets) is denoted by the symbol 0. Since a recursive set is (trivially) recursive relative 
to every set, $0 \le d$ for all degrees d.

The structure of the system of degrees of unsolvability is extremely complicated. 
Post and Kleene long ago showed that there are incomparable degrees: i.e., the 
ordering of degrees is not linear. A famous result of recursive function theory is this: 
there are incomparable r.e. degrees, i.e., degrees $d_1$, $d_2$ of recursively enumerable
(“r.e.”) sets such that neither $d_1 \le d_2$ nor $d_2 \le d_1$ holds. This was proved in 1957
by Richard Friedberg in the course of solving ‘‘Post’s Problem’’ (whether all non- 
recursive r.e. sets have the same degree). 

Let $d_1$ and $d_2$ be degrees of sets A and B. Let


$C = \{J(x,y) \mid x \in A \;\&\; y \in B\}.$


(C is a kind of “recursive cartesian product” of A and B.) It is easily seen that the
degree of C is a least upper bound on the degrees $d_1$, $d_2$; however, it has been proved
that greatest lower bounds do not in general exist. Thus the partial ordering of
degrees is not a lattice, but only an upper semilattice. A great deal of work has gone
into discovering the structure of this upper semilattice, and of the subsystem of r.e.
degrees. The structure is extremely messy. It has recently been shown, in fact, that an
arbitrary countable partial ordering can be imbedded in the ordering of degrees of
unsolvability.
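
The “recursive cartesian product” can be made concrete on finite examples. The sketch below is a minimal illustration in Python, assuming that the pairing function has the form J(x, y) = 2^x 3^y; the helper names `product_code` and `in_A_via_C` and the particular finite sets are hypothetical. It shows why A is recoverable from C once a single element b of B is known: x is in A exactly when J(x, b) is in C.

```python
def J(x: int, y: int) -> int:
    """Pairing code for (x, y); the form 2**x * 3**y is an assumption."""
    return 2 ** x * 3 ** y


def product_code(A, B):
    """C = {J(x, y) | x in A and y in B}."""
    return {J(x, y) for x in A for y in B}


def in_A_via_C(x: int, C, b: int) -> bool:
    """Decide "x in A" using only C and one known element b of B."""
    return J(x, b) in C


# Hypothetical finite stand-ins for A and B.
A = {0, 2}
B = {1, 4}
C = product_code(A, B)
```

The symmetric test with a known element of A recovers B, which is the sense in which both A and B are recursive in C when neither is empty.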


5. Relative recursive enumerability and the jump operator. Let A, B be sets of natural
numbers. If a B-machine can be programmed to print out exactly the set A (i.e., the
membership of A is printed out in some order, not necessarily a natural order, on
the machine's calculation tape, as the machine goes on running forever) then the
set A is said to be recursively enumerable relative to B (or “r.e. in B”).

Intuitively, if A is r.e. in B, then we may say that “if B were recursive, then A
would be recursively enumerable”; or “a Turing machine could generate the set A,
provided it had access to an oracle for the set B.” The relation “r.e. in”, unlike the
relation “recursive in”, is not transitive. A set A is r.e. in a recursive set just in case A
is itself r.e. (as is easily seen); but a set which is r.e. in an r.e. set need not be r.e. To
show this let $K = \{J(x,y) \mid x \in W_y\}$. Then, for any k, $W_k = \{x \mid J(x,k) \in K\}$, and
hence if one had access to an “oracle” for K, one could find out if an arbitrary
integer n belongs to $W_k$ by asking the oracle the single question “does $2^n 3^k$ belong
to K”. So it is easy to program a K-machine to answer correctly all questions of the
form “does $n \in W_k$”, for any given k. Therefore, K is an r.e. set in which every other
r.e. set is recursive; such an r.e. set is called a complete r.e. set.

(If there exists a recursive function f such that, for arbitrary n, $n \in A$ if and only
if $f(n) \in B$, then we say that A is “many-one reducible” to B. Many-one reducibility
is only one of a number of different types of reducibilities stronger than Turing
reducibility that have been studied. The above proof that every r.e. set is recursive
in K obviously gives the stronger result that every r.e. set is many-one reducible to K.)
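
The reduction in question is simply n ↦ J(n, k). A minimal Python sketch follows, assuming J(x, y) = 2^x 3^y and using a small finite table `W` as a hypothetical stand-in for the r.e. sets (the genuine W_k are in general infinite and are given by machines, not tables); the name `reduce_to_K` is likewise illustrative.

```python
def J(x: int, y: int) -> int:
    """Pairing code for (x, y); the form 2**x * 3**y is an assumption."""
    return 2 ** x * 3 ** y


# Toy stand-ins for two r.e. sets W_1 and W_2.
W = {1: {0, 2, 4}, 2: {1, 3}}

# K = {J(x, y) | x in W_y}, restricted to the toy family above.
K = {J(x, y) for y, members in W.items() for x in members}


def reduce_to_K(k: int):
    """Return the many-one reduction f(n) = J(n, k) from W_k to K."""
    return lambda n: J(n, k)


f = reduce_to_K(2)
# n is in W_2 exactly when f(n) is in K.
agree = all((n in W[2]) == (f(n) in K) for n in range(10))
```

A single question to an oracle for K therefore settles each membership question about W_k, which is what distinguishes many-one reducibility from the more liberal Turing reducibility.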

We saw in §3 that $D = \{x \mid x \in W_x\} = \{x \mid J(x,x) \in K\}$ is not recursive. But D is
obviously recursive in K; so K also is not recursive.

Now, let $C_1^B, C_2^B, \ldots$ be any natural enumeration of canonical descriptions of all
possible B-machines. (The descriptions do not include the contents of the B-tape; thus
the descriptions are finite, and, in fact, very similar to the descriptions $C_1, C_2, \ldots$ of
the ordinary Turing machines.) Let $T_1^B, T_2^B, \ldots$ be the B-machines whose descriptions
are (respectively) $C_1^B, C_2^B, \ldots$. Let $W_i^B$ for $i = 1, 2, 3, \ldots$ be the set of all natural numbers
ever printed out by $T_i^B$. Then the predicate $x \in W_y^B$ can be shown to be recursively
enumerable in B (i.e., $\{J(x,y) \mid x \in W_y^B\}$ is r.e. in B). This is called the relativized
enumeration theorem: the proof is a direct construction of a B-machine which
writes down all computations of all possible B-machines and examines them in order
to list all truths of the form “$x \in W_y^B$”. Let


$K^B = \{J(x,y) \mid x \in W_y^B\}.$


By the relativized enumeration theorem, $K^B$ is r.e. in B. Moreover, every set which
is r.e. in B is many-one reducible to $K^B$ and hence recursive in $K^B$ (by just the argument
that we gave before to show that all r.e. sets are recursive in K). $K^B$ is a complete
B-r.e. set. But $D^B = \{x \mid x \in W_x^B\}$ is not recursive in B, and so neither is $K^B$. Let us
now take B = K, i.e., let us consider $K^K$. Since all sets which are recursively enumerable
are recursive in K, and $K^K$ is not recursive in K, we see that $K^K$ is not r.e.
However, $K^K$ is r.e. in K; in fact, it is the complete K-r.e. set, as we just saw. Thus
we have proved that a set which is r.e. in an r.e. set need not be r.e.

Generalizing the argument, we see that if B is any set, then $K^B$ is a complete
B-r.e. set, and $K^{K^B}$ is a set which is r.e. in $K^B$ but not recursive in $K^B$, and hence
not B-r.e. Thus $K^{K^B}$ is an example of a set which is r.e. in a set which is r.e. in B,
but not itself r.e. in B.


If d is the degree of a set B, we define $d'$ (the “jump” of d) to be the degree of a
complete B-r.e. set. (If B is a set of natural numbers, then the notation $B'$ is also used
for the set $K^B$. In this notation $K^K$ is denoted as $K'$, $K^{K^K}$ as $K''$, etc.) Since any two
complete B-r.e. sets are recursive in each other, it is clear that $d'$ is well defined (i.e.,
$d'$ is independent of the choice of the particular representative B from the equivalence
class d and of the particular complete B-r.e. set). Since 0 is the degree of the recursive
sets, $0'$ is the degree of the complete r.e. set; if d is an r.e. degree, then $0 \le d \le 0'$.
Post's problem (referred to before) asked if there were any r.e. degrees other than 0
and $0'$; Friedberg succeeded in showing that there are infinitely many, and even that
there exists an infinite family of pairwise incomparable r.e. degrees.


6. The arithmetical hierarchy and the Kleene-Post representation theorem. Suppose 
we wished to classify the sets of integers most commonly encountered in mathematics. 
One way which naturally comes to mind is to classify them according to the structure 
of their definitions. Thus a set which can be defined by a definition of the form 
{n | (x)Rxn}, would be in one ‘‘box’’ of the classification; a set which can be defined 
by a definition of the form {n | (Ex) (y)Rxyn}, where R is a recursive 3-place predicate, 
would be in a different box, and so on. The whole structure looks like this: 


$(x_1)(Ex_2)(x_3)Rx_1x_2x_3n$        $(Ex_1)(x_2)(Ex_3)Rx_1x_2x_3n$


$(x_1)(Ex_2)Rx_1x_2n$                $(Ex_1)(x_2)Rx_1x_2n$


$(x_1)Rx_1n$                         $(Ex_1)Rx_1n$


THE ARITHMETICAL HIERARCHY — FORM OF DEFINING PREDICATE 


(R is here used as a special variable for recursive predicates) 


The symbols “(x)” and “(Ex)” are just the universal quantifier and the existential
quantifier of formal logic. (“(x)” is read “for every x”, and “(Ex)” is read “there is
an x such that”. Thus $\{n \mid (x_1)(Ex_2)Rx_1x_2n\}$, for example, is simply the set of all
natural numbers n which are such that no matter what natural number $x_1$ one
selects, it is possible to find a natural number $x_2$, possibly depending on both n and
on $x_1$, such that the three numbers $x_1$, $x_2$, n stand in the three-term recursive relation
$Rx_1x_2n$.) A predicate which consists of n alternating quantifiers of which the first
is universal and then a recursive predicate is called a $\Pi_n$ predicate; a predicate which
consists of n alternating quantifiers of which the first is existential and then a recursive
predicate is called a $\Sigma_n$ predicate. Also, a set which can be defined by a $\Sigma_n$ (respectively
$\Pi_n$) predicate is called a $\Sigma_n$ (respectively $\Pi_n$) set. The “Σ”, “Π” notation is obviously
motivated by the traditional analogy between the existential and universal quantifiers
and the operations of summation and product respectively. Using this notation, we
can redraw the diagram thus:


$\Sigma_3$ sets        $\Pi_3$ sets


$\Sigma_2$ sets        $\Pi_2$ sets


$\Sigma_1$ sets        $\Pi_1$ sets


Recursive Sets 


THE ARITHMETICAL HIERARCHY 


In view of the elementary equivalences

$\neg(x)R \equiv (Ex)\neg R$
$\neg(Ex)R \equiv (x)\neg R$

and the fact that the complement of a recursive predicate is recursive, we see that on
each level, the sets in one box are the complements of the sets in the box immediately
to the side. Also a recursive set, say $\{n \mid Rn\}$, can be defined by a $\Sigma_n$ predicate, for
arbitrary n, e.g., as
$\{n \mid (Ex_1)(x_2)(Ex_3)\cdots(Qx_n)(Rn \;\&\; x_1 = x_1 \;\&\; x_2 = x_2 \;\&\; \cdots \;\&\; x_n = x_n)\}$,
where “$(Qx_n)$” is an existential (universal) quantifier if n is odd (even), and by
a similar trick, by a $\Pi_n$ predicate; and by the same argument, every set that can be
defined by a $\Sigma_n$ or $\Pi_n$ predicate can be defined by a $\Sigma_{n+k}$ and by a $\Pi_{n+k}$ predicate for
all $k \ge 1$. Thus the levels are cumulative: every set of a given box belongs to both
boxes on any higher level.

The Hierarchy Theorem of Kleene asserts that the classification is indeed a
hierarchy: that is, (1) both boxes in every level above the base contain sets not found
at lower levels; and (2) every box contains a set not found in the box immediately
to the side.

For example, the r.e. sets are exactly the $\Sigma_1$ sets. The set K therefore lies in the
box of $\Sigma_1$ sets; but it does not lie in the box of $\Pi_1$ sets, because the $\Pi_1$ sets are just
the sets with r.e. complements, and K does not have an r.e. complement. Similarly,
$K^K$ lies in the box of $\Sigma_2$ sets, but not in the box of $\Sigma_1$ sets nor in the box of $\Pi_1$ sets.

Why did we leave out of our classification sets whose defining predicates have
the form $(x_1)(x_2)Rx_1x_2n$?

The answer is that


$\{n \mid (x_1)(x_2)Rx_1x_2n\} = \{n \mid (x)R(K(x), L(x), n)\},$

where, for natural numbers x, we define

K(x) = the exponent of 2 in the prime factorization of x,
L(x) = the exponent of 3 in the prime factorization of x.


Since the predicate R(K(x), L(x), n) is recursive whenever R is, this shows that any
set that can be defined with two successive universal quantifiers followed by a
recursive predicate can be defined with just one universal quantifier. In the same way,
any number of successive quantifiers of like quality (all existential or all universal)
can be “contracted”. Thus, in applying successive quantifiers to a recursive predicate,
we can assume without loss of generality that the quantifiers alternate in quality.
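
The contraction can be checked mechanically on a bounded range. The following Python sketch is an illustration only: the bound, the sample predicate `R`, and the helper names are hypothetical, and x is taken to range over the integers x ≥ 1 because 0 has no prime factorization. It compares the two-quantifier form with the one-quantifier form on the same finite set of exponent pairs.

```python
def exponent_of(p: int, x: int) -> int:
    """Exponent of the prime p in the factorization of x (requires x >= 1)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e


def K(x: int) -> int:
    return exponent_of(2, x)


def L(x: int) -> int:
    return exponent_of(3, x)


def two_quantifiers(R, n: int, bound: int) -> bool:
    """(x1)(x2) R x1 x2 n, with x1 and x2 restricted to range(bound)."""
    return all(R(x1, x2, n) for x1 in range(bound) for x2 in range(bound))


def one_quantifier(R, n: int, bound: int) -> bool:
    """(x) R(K(x), L(x), n), over the x >= 1 whose exponents stay below bound."""
    limit = 2 ** (bound - 1) * 3 ** (bound - 1)
    return all(
        R(K(x), L(x), n)
        for x in range(1, limit + 1)
        if K(x) < bound and L(x) < bound
    )


def R(x1: int, x2: int, n: int) -> bool:
    """Hypothetical recursive predicate: x1 + x2 differs from n."""
    return x1 + x2 != n
```

Every pair (x1, x2) below the bound arises as (K(x), L(x)) for x = 2^x1 3^x2, which is why the two bounded forms agree; the unbounded equivalence in the text rests on the same observation.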

According to a theorem of formal logic, any predicate that can be defined starting
from predicates $P_1, P_2, \ldots, P_n$ by using the familiar truth-functions (“or”, “and”,
“not”, etc.) and quantifiers of formal logic can be defined by a “normal form”
predicate: one consisting of a “prefix” of a string of quantifiers followed by a
“matrix” which is a predicate constructed out of the given predicates using truth-
functions alone. For example:


$\neg((x_1)Px_1nx_1 \vee (Ex_2)Qx_2n) \equiv \neg(x_1)Px_1nx_1 \;\&\; \neg(Ex_2)Qx_2n$
$\equiv (Ex_1)\neg Px_1nx_1 \;\&\; (x_2)\neg Qx_2n$
$\equiv (Ex_1)(\neg Px_1nx_1 \;\&\; (x_2)\neg Qx_2n)$
$\equiv (Ex_1)(x_2)(\neg Px_1nx_1 \;\&\; \neg Qx_2n).$
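
The first and last members of such a chain of equivalences can be compared by brute force on a small nonempty domain. The Python sketch below is a hedged illustration: the three-element `DOMAIN` is a stand-in for the natural numbers, P is treated for simplicity as depending only on its quantified argument and on n, and every function name is hypothetical. It runs through all truth tables for P and Q and confirms that the original predicate and its normal form agree.

```python
from itertools import product

DOMAIN = (0, 1, 2)  # small nonempty stand-in for the natural numbers


def original_form(P, Q, n) -> bool:
    """not ( (x1) P(x1, n)  or  (Ex2) Q(x2, n) )"""
    return not (
        all(P(x1, n) for x1 in DOMAIN) or any(Q(x2, n) for x2 in DOMAIN)
    )


def normal_form(P, Q, n) -> bool:
    """(Ex1)(x2) ( not P(x1, n)  and  not Q(x2, n) )"""
    return any(
        all((not P(x1, n)) and (not Q(x2, n)) for x2 in DOMAIN)
        for x1 in DOMAIN
    )


def all_predicates_agree(n: int = 0) -> bool:
    """Compare both forms for every pair of truth tables on DOMAIN."""
    for p_bits in product([False, True], repeat=len(DOMAIN)):
        for q_bits in product([False, True], repeat=len(DOMAIN)):
            P = lambda x, _n, bits=p_bits: bits[x]
            Q = lambda x, _n, bits=q_bits: bits[x]
            if original_form(P, Q, n) != normal_form(P, Q, n):
                return False
    return True
```

Pulling the quantifier over $x_2$ outward past the conjunction is legitimate only because the domain is nonempty, which is why `DOMAIN` must not be empty in this check.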


But a predicate which is constructed out of recursive predicates using truth-
functions alone is recursive (computability is obviously closed under truth-functions);
thus any set that can be defined starting from recursive predicates by using truth-
functions and quantifiers can be defined by a predicate consisting of quantifiers
followed by a recursive predicate. If we then “contract” any successive quantifiers
that happen to be of the same quality (all existential or all universal), in the fashion
explained above, we get a predicate of one of the forms $\Sigma_n$ or $\Pi_n$ (unless there are no
quantifiers at all, in which case we have a recursive predicate). Thus every such set
belongs to our hierarchy. In particular, every set that can be defined in first order
arithmetic belongs to our hierarchy. (“First order arithmetic” is a formal system
whose variables range over natural numbers and whose undefined predicates are just
$x = y + z$ and $x = y \cdot z$.) Since every recursive predicate can be defined from the
predicates $x = y + z$ and $x = y \cdot z$ using truth-functions and quantifiers (the basic
trick is a way of coding finite sequences of natural numbers which was contributed
by Gödel in his epochal paper on undecidable sentences), the converse is also true:
the Arithmetical Hierarchy is a classification of exactly the predicates that can be
defined in first order arithmetic.

The second way of classifying sets of natural numbers that naturally comes to 
mind, at least if one is a recursive function theorist, is the following: 


sets r.e. in sets r.e. in an r.e. set        complements of sets r.e. in sets r.e. in an r.e. set


sets r.e. in an r.e. set                     complements of sets r.e. in an r.e. set


r.e. sets                                    complements of r.e. sets


Recursive Sets 


This is an “intrinsic” hierarchy of sets of natural numbers, i.e., one which is not
based on the forms of the definitions of those sets. Yet a beautiful theorem due to
Kleene and Post, the Kleene-Post Representation Theorem, asserts that this is
exactly the same hierarchy as the Arithmetical Hierarchy. This theorem has a great
many interesting consequences. Consider the sequence of sets


$K, K^K, K^{K^K}, \ldots$ (or $K, K', K'', \ldots$)


for example. K is a complete r.e. set; $K^K$ is a complete K-r.e. set, and hence “complete”
for the class of sets r.e. in an r.e. set (i.e., every set which is r.e. in any r.e. set
is r.e. in K, and hence recursive in $K^K$); and, in general, the nth member of the
sequence is “complete” for the class of sets in the left hand box of the nth level of
the hierarchy depicted (counting the recursive sets as the 0-th level). It follows from
the Kleene-Post Representation Theorem that K is also a complete $\Sigma_1$ set, $K^K$ a
complete $\Sigma_2$ set, and, in general, the nth member of the above sequence is a complete
$\Sigma_n$ set (i.e., a $\Sigma_n$ set in which every $\Sigma_n$ set is recursive).


We already remarked that even the degrees of r.e. sets do not form a linear 
ordering, and a fortiori, the degrees of arithmetical sets (sets in the above hierarchy) 
do not form a linear ordering. But the following very simple linear ordering of 
degrees: 


$0, 0', 0'', 0''', \ldots$


(here 0 again denotes the degree of the recursive sets and ’ is the jump operator) at 
least has the property of dominating all arithmetical degrees—i.e., every arithmetical 
set has a degree which is $\le$ all sufficiently large degrees in this sequence.


7. The analytic sets. The predicates $J(x,y) \in W_1$, $J(x,y) \in W_2$, $J(x,y) \in W_3, \ldots$
are all the recursively enumerable two place predicates. For a two place predicate R
is r.e. just in case $\{J(x,y) \mid R_{xy}\}$ is r.e. (by definition: this definition captures the
intuitive idea that a two-place predicate is r.e. just in case the ordered pairs <x, y>
such that $R_{xy}$ can be generated by a Turing machine); hence just in case
$\{J(x,y) \mid R_{xy}\}$ is one of the sets $W_1, W_2, W_3, \ldots$ in the canonical enumeration of
all r.e. sets. Let $W_i^2$ be the predicate $J(x,y) \in W_i$, for $i = 1, 2, 3, \ldots$. Then
$W_1^2, W_2^2, W_3^2, \ldots$ can be regarded as a canonical enumeration of all two-place r.e.
predicates.

It may happen that the predicate $W_i^2$ is a simple ordering; this can be expressed
by the arithmetical condition on the index i:


$(x)(J(x,x) \in W_i) \;\&$
$(x)(y)((J(x,y) \in W_i \;\&\; J(y,x) \in W_i) \supset x = y) \;\&$
$(x)(y)(z)((J(x,y) \in W_i \;\&\; J(y,z) \in W_i) \supset J(x,z) \in W_i).$


Note that by the theorems of the preceding section the clauses of the form
$J(n,m) \in W_i$ can be expressed in the form of an existential quantifier followed by a
recursive predicate (because the r.e. set $W_i$ is $\Sigma_1$); thus this condition, which we
shall abbreviate “SIMPLE(i)”, is indeed an arithmetical condition on i.
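
For a relation on a finite field the clauses of the displayed condition (reflexivity, antisymmetry, transitivity) can be tested directly. The Python sketch below is illustrative only: `related(x, y)` plays the part of the clause “J(x, y) belongs to W_i”, the quantifiers are restricted to a finite `field` rather than ranging over all natural numbers, and the function name is hypothetical. It checks exactly the three clauses shown and nothing further.

```python
def satisfies_simple_clauses(field, related) -> bool:
    """Check the three displayed clauses for `related` on a finite field."""
    reflexive = all(related(x, x) for x in field)
    antisymmetric = all(
        x == y or not (related(x, y) and related(y, x))
        for x in field
        for y in field
    )
    transitive = all(
        related(x, z) or not (related(x, y) and related(y, z))
        for x in field
        for y in field
        for z in field
    )
    return reflexive and antisymmetric and transitive


field = range(4)
# The usual ordering of 0, 1, 2, 3 satisfies all three clauses.
usual_order = satisfies_simple_clauses(field, lambda x, y: x <= y)
# "y is x or the successor of x modulo 4" is reflexive but not transitive.
cyclic_steps = satisfies_simple_clauses(field, lambda x, y: (y - x) % 4 in (0, 1))
```

In the genuine condition each membership clause is itself only r.e., so the check cannot be carried out by a terminating procedure; the point of the text is that the condition is nevertheless arithmetical.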

A very important set, first studied by Markwald, is the set of indices i such that
$W_i^2$ is a well ordering. (This set, the set of indices of r.e. well orderings, was called
“W” by Markwald; however, it has become standard recently to use “W” as the
notation for the set of indices of recursive well orderings instead.) This set cannot
be defined by an arithmetical condition, as Markwald proved, but it can be defined
by a condition using a quantifier over functions of natural numbers:


i is the index of an r.e. well ordering $\equiv$
SIMPLE(i) $\&\; (f)\bigl((x)(J(f(x+1), f(x)) \in W_i) \supset (Ey)(f(y+1) = f(y))\bigr)$


(i.e., $W_i^2$ is a well ordering if and only if $W_i^2$ is a simple ordering and there are no
infinite descending chains in the ordering $W_i^2$).


If we adjoin to the above condition the further clause:

$(Ex)(y)(z)(J(y,z) \in W_x \equiv J(y,z) \notin W_i),$


then the definition becomes a definition of the class of recursive well orderings (or,
rather, of the corresponding set of indices), for this further clause just says that the
predicate $W_i^2$ has an r.e. complement $W_x^2$, and a predicate is recursive just in case it
and its complement are both r.e.
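
The last remark has a simple algorithmic content: given an enumeration of a set and an enumeration of its complement, one decides membership by running the two side by side until the number in question appears in one of them. A minimal Python sketch follows, assuming both enumerations are infinite generators; the names `evens`, `odds`, and `decide` are hypothetical, and the even and odd numbers merely stand in for an r.e. set and its r.e. complement.

```python
from itertools import count


def evens():
    """Enumerate a sample set: the even numbers."""
    return (2 * k for k in count())


def odds():
    """Enumerate its complement: the odd numbers."""
    return (2 * k + 1 for k in count())


def decide(n: int, enumerate_set, enumerate_complement) -> bool:
    """Alternate between the two enumerations until n appears in one of them."""
    for a, b in zip(enumerate_set(), enumerate_complement()):
        if a == n:
            return True
        if b == n:
            return False
    # Reached only if an enumeration runs out, which the assumption excludes.
    raise ValueError("enumerations exhausted without listing n")
```

Every natural number occurs in exactly one of the two lists, so the loop always halts; with only one of the two enumerations available the search for a non-member would run forever.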

The set of indices of r.e. well orderings, and the set of indices of recursive well 
orderings are examples of analytic sets, that is, sets whose definition requires a 
function quantifier ‘‘(f)’’, or a string of function quantifiers, followed by a predicate 
which is built up out of the function variables and ordinary recursive predicates by 
truth functions, number quantifiers, and composition. The analytic sets have a 
hierarchy theorem which is directly analogous to the hierarchy theorem for arithmet- 
ical sets; but the problem of a representation theorem for these sets is the major 
unsolved problem of hierarchy theory. 

We list some elementary facts about function quantifiers and analytical sets for 
future reference: 

(1) Quantification over two-place functions can be reduced to quantification
over one-place functions. $(Ef)(x)(Ey)(f(x,y) = 0)$ is equivalent to
$(Ef)(x)(Ey)(f(J(x,y)) = 0)$, for example, where the quantifier “(Ef)” is over two-place functions
in the formula to the left of “is equivalent to” and over one-place functions in the
formula to the right of “is equivalent to”. Similarly, quantification over three and
more place functions can be reduced to quantification over one-place functions.

(2) Function quantifiers can be advanced ahead of number quantifiers. Let
$\lambda y\, f(y)$ be a notation for “the function whose value for arbitrary y is f(y)”. The
advancing of function quantifiers is based on the fact that $(x)(Ef)R(f,x)$ is
equivalent to $(Eg)(x)R(\lambda y\, g(x,y), x)$. E.g., $(x)(Ef)(f(x+17) = 0)$ is equivalent to
$(Eg)(x)(g(x, x+17) = 0)$, which is equivalent to $(Eg)(x)(g(J(x, x+17)) = 0)$ by
(1). Likewise, $(Ex)(f)R(f,x)$ is equivalent to $(g)(Ex)R(\lambda y\, g(x,y), x)$. (This paragraph
assumes the axiom of choice.)

(3) Function quantifiers of like quality can be contracted to one if they are in
immediate succession. This is based on the fact that $(Ef)(Eg)R(f,g)$ is equivalent to
$(Ef)R(\lambda y\, K(f(y)), \lambda y\, L(f(y)))$, where K and L are the functions we introduced in the
preceding section. E.g., $(Ef)(Eg)(x)(f(x) = g(x+1))$ is equivalent to
$(Ef)(x)(K(f(x)) = L(f(x+1)))$. Likewise $(f)(g)R(f,g)$ is equivalent to
$(f)R(\lambda y\, K(f(y)), \lambda y\, L(f(y)))$.


(4) A function quantifier and a number quantifier of like quality can be contracted
to just a function quantifier, if they are in immediate succession. This is
based on the fact that $(Ef)(Ex)R(f,x)$ is equivalent to $(Ef)R(\lambda y\, K(f(y)), L(f(0)))$.
E.g., $(Ef)(Ex)(f(x) = x + 3)$ is equivalent to $(Ef)(K(f(L(f(0)))) = L(f(0)) + 3)$.
Likewise, $(f)(x)R(f,x)$ is equivalent to $(f)R(\lambda y\, K(f(y)), L(f(0)))$.


(5) Number quantifiers can be reduced to one by using a function quantifier.
This is based on the fact that, for example, $(x)(Ey)(z)(Ew)R(x,y,z,w)$ is equivalent
to $(Ef)(Eg)(x)(z)R(x, f(x), z, g(x,z))$. Contracting the number quantifiers and
contracting the function quantifiers we obtain the equivalent
$(Ef)(x)R(K(x), K(f(K(x))), L(x), L(f(J(K(x), L(x)))))$. Alternatively, we could first have taken the
negation of $(x)(Ey)(z)(Ew)R(x,y,z,w)$, which is $(Ex)(y)(Ez)(w)\neg R(x,y,z,w)$.
Applying the preceding technique to this negation, we obtain
$(Ex)(Ef)(y)(w)\neg R(x, y, f(y), w)$, since the value chosen for z depends on y. Negating again, to get
something equivalent to the original formula, we get $(x)(f)(Ey)(Ew)R(x, y, f(y), w)$.
Contracting the two universal quantifiers and the two existential quantifiers, we obtain
the equivalent $(f)(Ey)R(L(f(0)), K(y), K(f(K(y))), L(y))$. Thus an arithmetical formula
can be rewritten with a prefix of either of the two forms $(Ef)(x)$ or $(f)(Ex)$.

(6) If both number quantifiers and function quantifiers are present, the number
quantifiers can be reduced to one without increasing the number of function quantifiers.
Proof: the function quantifiers can be advanced ahead of all the number
quantifiers by (2). The predicate that follows all the function quantifiers now begins
with a string of number quantifiers. If the last function quantifier is existential
(universal) rewrite this predicate in the form $(Ef)(x)$ (respectively $(f)(Ex)$) by the
methods of (5). Then contract the two successive existential (universal) quantifiers.
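
The step underlying (5), the passage from “for every x there is a y” to “there is a function f giving such a y for every x”, can be illustrated on a finite domain, where no appeal to the axiom of choice is needed. The Python sketch below is an illustration under stated assumptions: the three-element domain, the sample predicate `R`, and the helper names are hypothetical. It searches all functions from the domain to itself for one that witnesses the existential quantifier uniformly.

```python
from itertools import product


def forall_exists(R, domain) -> bool:
    """(x)(Ey) R(x, y) over a finite domain."""
    return all(any(R(x, y) for y in domain) for x in domain)


def find_choice_function(R, domain):
    """Return some f with (x) R(x, f(x)) as a dict, or None if there is none."""
    for values in product(domain, repeat=len(domain)):
        f = dict(zip(domain, values))
        if all(R(x, f[x]) for x in domain):
            return f
    return None


domain = (0, 1, 2)


def R(x: int, y: int) -> bool:
    """Hypothetical predicate: x + y is divisible by 3."""
    return (x + y) % 3 == 0


f = find_choice_function(R, domain)
```

On this domain `forall_exists` holds exactly when `find_choice_function` succeeds, which is the finite shadow of the equivalence between $(x)(Ey)R(x,y)$ and $(Ef)(x)R(x,f(x))$.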

We can sum up (1) through (6) by saying that any analytical predicate can be
written in a form in which the prefix consists of function quantifiers followed by a
single number quantifier, and in which existential and universal quantifiers strictly
alternate. If there are n function quantifiers in this form and the first is existential
(universal) then the predicate is called a $\Sigma_n^1$ (respectively $\Pi_n^1$) predicate; what we
previously called $\Sigma_n$ and $\Pi_n$ predicates are now called $\Sigma_n^0$ and $\Pi_n^0$ predicates, when
confusion with analytical predicates is possible. The superscript (0 or 1) denotes the
type of the quantifiers of highest type, roughly in the sense of Russell's theory of
types (taking the natural numbers as type 0, however, which is very un-Russellian);
the subscript n denotes the number of alternating quantifiers of highest type.

The hierarchy theorem for the arithmetical hierarchy asserts that for every n, there
is a $\Sigma_{n+1}^0$ set which is not a $\Pi_{n+1}^0$ set (and hence not a $\Sigma_n^0$ or a $\Pi_n^0$ set), and also a
$\Pi_{n+1}^0$ set which is not a $\Sigma_{n+1}^0$ set. The analogous hierarchy theorem for the analytic
hierarchy, whose proof will not be reviewed here, asserts that for every n, there is a
$\Sigma_{n+1}^1$ set which is not a $\Pi_{n+1}^1$ set (and hence not a $\Pi_n^1$ or a $\Sigma_n^1$ set), and also a $\Pi_{n+1}^1$
set which is not a $\Sigma_{n+1}^1$ set.


8. Extending the arithmetical hierarchy. As we saw, the $\Sigma_n^0$ sets are just the sets
that bear the nth power of the relation “r.e. in” to recursive sets, and hence just the
sets many-one reducible to $K^n$. This gives us a very neat characterization of the
effect of number quantifiers: an existential (universal) number quantifier just takes
us from a predicate to another predicate (respectively the complement of a predicate)
which is r.e. in the first predicate. Also, if a class of predicates contains the recursive
predicates and is closed under truth functions and bounded number quantifiers
(quantifiers of the forms “for all x less than y” and “there exists an x less than y”),
then every predicate which is r.e. in a predicate of the class is the existential
quantification of a member of the class. This suggests looking for a similar characterization
of the effect of function quantifiers.

To put it crudely: an existential quantifier ‘‘means’’ r.e. in. What does a function 
quantifier ‘‘mean’’? This question has led to years of research on the part of many 
investigators. 

Since the sets of numbers we can define if we add function quantifiers to first 
order arithmetic are just the analytic sets of numbers, this problem is just the problem 
of a representation theorem for the analytical hierarchy. We have just said what the
$\Sigma_n^0$ sets are; what are the $\Sigma_n^1$ sets?

The obvious way to try to answer these questions is to try to extend the arithmeti- 
cal hierarchy. If we can extend it in some natural way so that the extended hierarchy 
contains all analytic sets of numbers, then doubtless we will be able to “look” at the
extension and “see” what adding a function quantifier does to the place of a set or
predicate in the hierarchy. 

But how should we go about extending the arithmetical hierarchy? In a sense, the
entire arithmetical hierarchy is encapsulated in the following ω-sequence of sets:
$K, K', K'', K''', \ldots$. This suggests the first move to make: try extending this sequence of
“complete” sets into the transfinite.

The classical approach (due independently to Davis and Mostowski) was to first
encode the ordinals (up to some fixed countable ordinal) by integers. Let us use
“$N_\alpha$” to denote the set of integer “notations” for the ordinal $\alpha$. (This assumes some
fixed encoding has been adopted.) Then we may now take:


DEFINITION I. 


$K_0 = \emptyset$.
$K_{\alpha+1} = (K_\alpha)'$. (Note that $K_1 = K$.)
$K_\lambda = \{J(m,n) \mid (E\alpha)(\alpha < \lambda \;\&\; n \in N_\alpha \;\&\; m \in K_\alpha)\}$


($\lambda$ is a special variable for limit ordinals).


What this says is that at limit ordinals $\lambda$ the corresponding “complete set” $K_\lambda$ is a
kind of recursive union of the previous $K_\alpha$. This inductive definition associates a set
$K_\alpha$ with each ordinal $\alpha$ for which there is a “notation” in the selected “encoding”.
Davis and Mostowski used a more complicated definition of the sets, leading not to
a unique set $K_\alpha$ associated with each ordinal $\alpha$ for which they had a notation, but to
a set $H_n$ associated with each notation n. It then had to be proved (as it was later, by
Spector) that if n, m belong to the same $N_\alpha$, then $H_n$ is Turing equivalent to $H_m$ (for
the particular system of encoding employed by these authors, the system “$\mathcal{O}$” of
Church and Kleene). When this was proved by Spector, it was possible to see that a
genuine hierarchy of degrees of unsolvability associated with ordinals had been
defined, even though in the definition sets, not degrees, were associated with notations,
not ordinals. However, it has been pointed out by Luckham and myself, and also
independently by Enderton, that the above definition, leading to a unique set $K_\alpha$
associated with each ordinal, is essentially equivalent to the Davis-Mostowski
procedure.

The set $K_\alpha$, unfortunately, depends not only on the ordinal $\alpha$ but on the notation
system (the “system of encoding”) employed. How bad is this?

The answer is: “very bad”. One can get as high a degree of unsolvability as one
likes already for $K_\omega$, by choosing a suitable notation system. Since our aim is to
extend the sequence $K', K'', K''', \ldots$ in some invariant way by defining sets $K_\omega$,
$K_{\omega+1}$, $K_{\omega+2}, \ldots$ etc., and notations for ordinals are only to be an auxiliary in this
task, we find this extremely unsatisfactory.

Through the constructive ordinals a resolution of this difficulty is available.
Each constructive ordinal is the ordinal of a recursive well ordering. So a natural
“notation system” for the ordinals less than $\alpha$, where $\alpha$ is any constructive ordinal, is
any recursive well ordering of order type $\alpha$. If R is a well ordering, or even a well
founded partial ordering, we define the natural map from Field(R) onto the ordinals
in the appropriate segment as follows:


(1) If $n \in \mathrm{Field}(R)$ is R-minimal, then $|n|^R = 0$ (“n is an R-notation for 0”).

(2) If $m, n \in \mathrm{Field}(R)$ and m is an immediate R-successor of n, then
$|m|^R = |n|^R + 1$.

(3) If $m \in \mathrm{Field}(R)$ is an R-limit, then $|m|^R$ = the least upper bound of
$\{|n|^R \mid Rnm\}$.
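
For a finite field the natural map can be computed outright, since only clauses (1) and (2) ever apply and every value is a natural number. The Python sketch below is a hedged illustration: it uses the single recursion “rank of m = least number exceeding the ranks of everything strictly below m”, which agrees with clauses (1) and (2) on the examples shown but is a swapped-in formulation rather than the text's own wording; `below` is the strict part of the ordering, and the names `rank_map` and `divides_properly` are hypothetical.

```python
def rank_map(field, below):
    """Map each element of a finite well-founded ordering to its rank."""
    ranks = {}

    def rank(m):
        if m not in ranks:
            # Minimal elements have no predecessors and receive rank 0.
            ranks[m] = max(
                (rank(n) + 1 for n in field if below(n, m)), default=0
            )
        return ranks[m]

    for m in field:
        rank(m)
    return ranks


def divides_properly(n: int, m: int) -> bool:
    """Strict divisibility, a well-founded partial ordering on positive integers."""
    return n != m and m % n == 0


ranks = rank_map([1, 2, 3, 6, 12], divides_properly)
```

With a linear ordering the ranks simply count positions; with the divisibility ordering above, 2 and 3 are both notations for the ordinal 1, which shows how one ordinal may have several notations.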

It turns out that if we regard any recursive well ordering as a notation system
in this way, and choose that notation system as the system we use in the above
definition of the sets $K_\alpha$ (writing $|n|^R = \alpha$ for $n \in N_\alpha$), then the degrees of the resulting
sets $K_\alpha$ are independent of R. This theorem (which might be called the external
uniqueness theorem for recursive well orderings) was proved by Enderton and
Luckham. The same theorem holds for r.e. well orderings; and this result has the
Spector uniqueness theorem for the system $\mathcal{O}$ as a special case.

(The Spector uniqueness theorem is that if n, m are two notations for the same
ordinal in the Church-Kleene system $\mathcal{O}$, then the sets $H_n$, $H_m$ are of the same degree.
The definition of the sets $H_n$ is more complicated than that given above for the sets
$K_\alpha$; but if one takes the notations m in the system $\mathcal{O}$ which bear the partial ordering
relation $<_{\mathcal{O}}$ to a given notation, then (1) the resulting “path” in the system $\mathcal{O}$ is a
recursively enumerable well ordering; and (2) the sets $H_n$ along the path are of the
same Turing degrees as the sets $K_\alpha$ along the path, i.e., thinking of the path as a
system of notations, and ignoring the rest of $\mathcal{O}$.)

The result of Enderton and Luckham justifies the following definition for constructive
ordinals $\alpha$ (the “constructive ordinals” were originally defined in terms of
the system $\mathcal{O}$, in fact, as the ordinals for which there are notations in that system,
but it is convenient to redefine them simply as the ordinals of recursive well orderings):

$d_\alpha$ = the degree of unsolvability of $K_\alpha$, where $K_\alpha$ is the set associated with the
ordinal $\alpha$ by Definition I if we use an arbitrary recursive system of notations (i.e.,
recursive well ordering or well founded partial ordering) which contains a notation
for $\alpha$.

Thus a true hierarchy of degrees of unsolvability has been associated with con- 
structive ordinals in a satisfactorily invariant manner. 


9. Extending beyond the constructive ordinals. The sets $K_\alpha$, $\alpha$ a constructive
ordinal, and the sets recursive in them, together form only the hyperarithmetic sets
(assuming we use recursive systems of notations, as just proposed), and these are
only the tiniest fraction of the analytic sets; in fact, just the sets in $\Sigma_1^1 \cap \Pi_1^1$. Our
original program was to extend the sequence $K, K', K'', \ldots$ until we had (at least) all
the analytic sets. We have certainly not succeeded in this! At present various attempts
have been made to extend farther. The farthest extension yet proposed is the hierarchy
of “constructible degrees of unsolvability” proposed by Boolos and the author [2];
to show that this extension, which it goes beyond the scope of the present paper to
discuss, includes all sets of integers or even all analytic sets of integers, requires the
assumption of the controversial axiom “V = L” suggested by Gödel (but no longer
advocated by him) as a new axiom for general set theory in his paper on the
consistency of the axiom of choice and the generalized continuum hypothesis. In the
absence of “V = L”, or some such axiom, it remains consistent that the extension
referred to does not include all analytic sets of integers. Moreover, even if we
assume “V = L”, no good representation theorem for the analytic hierarchy
has yet been proved by anyone. The preceding section of this paper is the barest
introduction to a fascinating and totally unsolved problem: the problem of obtaining
an understanding of the analytic hierarchy from the viewpoint of recursive function
theory.


References 


1. Hartley Rogers, Jr., Theory of Recursive Functions and Effective Computability, McGraw- 
Hill, New York, 1967. This work contains an excellent bibliography of the subject. 

2. George Boolos and Hilary Putnam, Degrees of unsolvability of constructible sets of integers, 
J. Symbolic Logic, vol. 33, no. 4 (December 1968), pp. 497-513. 


FUNCTION THEORY ON SOME NONARCHIMEDEAN FIELDS 
ABRAHAM ROBINSON, Yale University 


1. Introduction. Archimedes’ axiom states that for any two positive numbers a 
and b, a smaller than b, the continued addition of a to itself ultimately yields numbers 
which are greater than b. More formally, if F is an ordered abelian group or, more 
particularly, an ordered field, then Archimedes’ axiom is as follows. 


1.1. If 0 < a < b, where a and b are elements of F, then there exists a natural
number n such that

$$\underbrace{a + a + \cdots + a}_{n \text{ times}} > b.$$


Throughout the history of mathematics, Archimedes’ axiom has been associated 
with the foundations of the Differential and Integral Calculus. Already in Greek 
science the method which, much later, was dubbed the method of exhaustion and 
which, to a large extent, anticipated the ε, δ method in the calculation of areas and
volumes, depended on the validity of Archimedes’ axiom, which was formulated 
explicitly for this purpose. On the other hand, when a method of infinitely small and 
infinitely large numbers is used, as in Nonstandard Analysis, then it is just the non- 
archimedean nature of the system which is essential for its success or, more precisely, 
the superposition of a nonarchimedean field on the archimedean field of real numbers. 

Although Nonstandard Analysis (see [4] or [6]) may perhaps be regarded as 
the most successful effort in this direction, many other systems have been introduced 
for the same purpose. Thus, not long ago, D. Laugwitz [2] considered a theory of 
functions on the field L of generalized power series with real coefficients and real 
exponents. The same field was investigated many years earlier by T. Levi-Civita 
[3], also because of its nonarchimedean character, and by A. Ostrowski [5], in
connection with the theory of valuations.

Laugwitz raised the question whether the functions considered by him satisfy 
the intermediate value theorem and the mean value theorem of the Differential 
Calculus. We shall show in the present paper that although these theorems are not 
valid here in full generality, they are true under rather wide conditions. In order to 
obtain these results, we shall embed L in the residue class field °R of a certain subring 
of a nonstandard model of Analysis, *R. It appears that °R has many interesting 
properties which make it a suitable subject for investigation quite apart from the 
particular problem just mentioned. In particular, the behavior of a function on °R 
is closely connected with the theory of asymptotic expansions, although we shall not 
pursue this topic in the present paper. 




88 ABRAHAM ROBINSON [June-July 


2. Ordered fields and fields with valuation. An ordered field F is a commutative 
field in which an ordering relation x < y (or, equivalently, y > x) is defined and 
satisfies the following conditions. 

2.1. The ordering is transitive, x < y and y < z implies x < z, and irreflexive,
x < y implies $x \neq y$.

2.2. The ordering is total, if $x \neq y$ then either x < y or y < x (but not both,
by 2.1).

2.3. The ordering is related to addition by the requirement that x < y implies
x + z < y + z; and to multiplication by the requirement that x < y and 0 < z
implies xz < yz.


An ordered field can be characterized also by means of the set of its positive 
elements P = {x | x > 0}. Thus, suppose that a subset P of a field F possesses the 
following properties. 

2.4. $0 \notin P$; for all $x \neq 0$, $x \in P$ or $-x \in P$.

2.5. If $x, y \in P$, then $x + y \in P$ and $xy \in P$.


Then the relation defined by
x < y if and only if $y - x \in P$


satisfies the conditions 2.1-2.3 and P is just the set of positive elements of the field 
according to this relation. 

We shall suppose that the reader is familiar with the elementary properties of 
ordered fields, e.g., that an ordered field is of characteristic 0 and that $x^2 > 0$ for all
$x \neq 0$. As usual, we write $x \le y$ or $y \ge x$ if either x < y or x = y.

The rational numbers form an ordered field Q whose positive elements are the 
fractions (ratios) of natural numbers different from zero, and the real numbers form 
an ordered field R whose positive elements are just the squares other than zero. In 
both cases the ordering is unique. Moreover, both Q and R are archimedean, i.e., 
they satisfy Archimedes’ Axiom 1.1. 

Perhaps the simplest example of a non-archimedean field is as follows. Let R(t) 
be a simple transcendental extension of the field of real numbers R. Thus R(t) may be 
identified with the field of rational functions of the indeterminate t with coefficients 
in R; each element of R(t) may be written in the form

$$f = \frac{p(t)}{q(t)} = \frac{a_0 + a_1 t + \cdots + a_n t^n}{b_0 + b_1 t + \cdots + b_m t^m} \tag{2.6}$$

where $q(t) \neq 0$, that is, at least one of the $b_j$ is different from 0. We may then suppose the
first $b_j \neq 0$ is actually equal to 1, for if this is not the case from the outset, we may
achieve it by multiplying the numerator and denominator on the right hand side
of (2.6) by $b_j^{-1}$. Thus, if $f \neq 0$, we may write


1973] FUNCTION THEORY ON SOME NONARCHIMEDEAN FIELDS 89 


$$f = \frac{a_k t^k + \cdots + a_n t^n}{t^j + b_{j+1} t^{j+1} + \cdots + b_m t^m}, \qquad a_k \neq 0, \quad 0 \le k \le n, \quad 0 \le j \le m. \tag{2.7}$$


We now determine an ordering in R(t) by defining that $f \neq 0$ is positive if and
only if $a_k > 0$. To make sure that this is a good definition one first has to check that
it is independent of the particular representation (2.7) chosen for the given f. Next 
one verifies that the set of positive elements of R(t) defined in this way satisfies the 
conditions of 2.4 and 2.5. We suppose that these rather simple tasks have been 
carried out so that R(t) becomes indeed an ordered field with the above definition. 
Moreover, this ordered field is nonarchimedean. For, by our definitions, 0 < t, t < 1
(since 1 − t is positive) and, for any positive integer n,

$$\underbrace{t + t + \cdots + t}_{n \text{ times}} < 1$$

(since 1 − nt is positive). This shows that 1.1 is not satisfied.
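The ordering of R(t) just described can be tried out on small examples. The following is only a sketch under stated assumptions: rational functions are stored as pairs of coefficient lists (ascending powers of t) with rational coefficients standing in for real ones, and the helper names `poly_mul`, `poly_sub`, `lowest_coeff`, `is_positive` and `less_than` are illustrative, not taken from the paper. Instead of normalizing the lowest denominator coefficient to 1 as in (2.7), the sketch tests the sign of the quotient of the two lowest coefficients, which amounts to the same thing.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [Fraction(a) - Fraction(b) for a, b in zip(p, q)]

def lowest_coeff(poly):
    """First nonzero coefficient, i.e. the coefficient of the lowest power of t."""
    for c in poly:
        if c != 0:
            return Fraction(c)
    raise ValueError("zero polynomial")

def is_positive(num, den):
    """f = num/den is positive iff a_k > 0 after scaling the lowest
    denominator coefficient to 1, i.e. iff a_k / b_j > 0."""
    return lowest_coeff(num) / lowest_coeff(den) > 0

def less_than(f, g):
    """f < g iff g - f is a nonzero positive element; f, g are (num, den) pairs."""
    (p1, q1), (p2, q2) = f, g
    num = poly_sub(poly_mul(p2, q1), poly_mul(p1, q2))
    den = poly_mul(q1, q2)
    return any(c != 0 for c in num) and is_positive(num, den)

zero = ([0], [1])
one = ([1], [1])
t = ([0, 1], [1])

def n_times_t(n):
    """The element t + t + ... + t (n times) of R(t)."""
    return ([0, n], [1])

# 0 < t < 1, and n*t < 1 for every sampled n: Archimedes' axiom fails.
checks = [less_than(zero, t), less_than(t, one)]
checks += [less_than(n_times_t(n), one) for n in (1, 10, 10**6)]
print(all(checks))
```

A finite sample of n proves nothing by itself, of course; the point is only that the sign test never looks past the lowest-order coefficient, so no multiple of t can overtake 1.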

In any ordered field, the absolute value of a number a is defined to be |a| = a if
$a \ge 0$, otherwise |a| = −a. Then |ab| = |a| |b| and $|a + b| \le |a| + |b|$ (triangle
inequality).

Let F be a nonarchimedean ordered field. Then F is of characteristic 0 and, 
hence, contains the field of rational numbers Q. An element $a \in F$ is said to be infinite
if |a| > q for all $q \in Q$. Also, $a \in F$ is said to be infinitely small or infinitesimal if
|a| < q for all positive $q \in Q$. $a \in F$ is finite if it is not infinite. This will be the
case if and only if |a| < q for some $q \in Q$.

The finite elements of F constitute a subring $F_0$ of F. The infinitesimal elements of
F constitute a proper ideal $F_1$ within $F_0$. $F_1$ is maximal in $F_0$, as can be seen by the
following argument. Suppose that $F_1 \subset J \subset F_0$ where J is an ideal in $F_0$ such that
$J - F_1 \neq \emptyset$. Let $a \in J - F_1$; then a is not infinitesimal. We conclude without
difficulty that $a^{-1}$ is finite, so $a^{-1} \in F_0$, $aa^{-1} = 1 \in J$. But then $J = F_0$, so $F_1$ is maximal
in $F_0$.

It follows that $F' = F_0/F_1$ is a field. F′ is called the residue class field of the
ordering. The canonical mapping $\psi: F_0 \to F'$ induces an ordering in F′ according to
the rule that, for any $\alpha \in F'$, $\alpha \neq 0$, α is to be positive in F′ if and only if one (and
hence, all) of the elements of $\psi^{-1}\alpha$ is (are) positive. It is not difficult to show that
F′ is archimedean according to this ordering and (hence) that it is isomorphic and
order-isomorphic to a subfield of R.

The cosets of $F_1$ as an additive subgroup of F are called monads. If a is any
element of F then we denote the monad containing it by μ(a). In particular, $\mu(0) = F_1$.
The monads which are subsets of $F_0$ may be identified with the elements of F′.

As a tool in our investigation of nonarchimedean fields we shall require also the
notion of a field with valuation, more particularly, the notion of a field with
nonarchimedean valuation in the real numbers. This concept is given by a field F
together with a mapping v(x) from F − {0} into the real numbers R such that the
following conditions are satisfied:


2.8. For all $x \neq 0$, $y \neq 0$ in F, v(xy) = v(x) + v(y).
2.9. For all x, y in F such that $x \neq 0$, $y \neq 0$, $x + y \neq 0$,


$$v(x + y) \ge \min(v(x), v(y)).$$


If we add to R an element ∞ (usually called "a symbol") with the rules
x + ∞ = ∞ + x = ∞ + ∞ = ∞ and the stipulation that ∞ > x for all real x, then
the auxiliary definition v(0) = ∞ ensures that the relations of 2.8 and 2.9 are
satisfied without any restriction on x and y.

The set $O_F = \{x \in F \mid v(x) \ge 0\}$ is a subring of F, the valuation ring, and the set
$J_F = \{x \in F \mid v(x) > 0\}$ constitutes a maximal ideal in $O_F$, the valuation ideal. The
field $O_F/J_F$ is called the residue class field of the given valuation.

Let c be an arbitrary but fixed constant greater than 1. Then the definition of
distance

$$d(x, y) = c^{-v(x - y)},$$

where $c^{-\infty}$ is interpreted as 0, turns F into a metric space. If every Cauchy sequence
in that space has a limit then F is said to be complete for the given valuation.

See [1], [7] or [8] for basic facts in valuation theory. From now on such facts 
will be taken for granted. 


3. The field L. The field R(t) is inadequate for the development of the calculus 
because we cannot extend to it even some of the most common functions defined in 
the field of real numbers, e.g., the function $y = \sqrt{x}$. Passing to the field of formal
Laurent series $\sum_{n=-m}^{\infty} a_n t^n$, $a_n \in R$, does not remedy the situation. Following
Laugwitz, we therefore consider the field of generalized power series L, which is 
defined as follows: 

The elements of L are the formal expressions

$$\sum_{k=0}^{\infty} a_k t^{\nu_k}, \qquad a_k, \nu_k \in R, \quad \nu_k \uparrow \infty, \tag{3.1}$$

(where the last symbol implies $\nu_0 < \nu_1 < \nu_2 < \cdots$). Two expressions (3.1) are, by
definition, regarded as equal if for any term $a_k t^{\nu_k}$ which occurs in one but not in the
other, $a_k = 0$. We shall also write $a_0 t^{\nu_0} + a_1 t^{\nu_1} + \cdots + a_n t^{\nu_n}$ for an expression for
which $a_{n+1} = a_{n+2} = \cdots = 0$.

The sum of two expressions $\sum a_k t^{\nu_k}$ and $\sum b_k t^{\mu_k}$ as in (3.1) is the expression
$\sum c_k t^{\lambda_k}$ which is defined as follows. The sequence $\{\lambda_k\}$ is the set theoretical union of
the sequences $\{\nu_k\}$ and $\{\mu_k\}$ arranged in increasing order. If a particular $\lambda_m$ occurs
both in $\{\nu_k\}$ and in $\{\mu_k\}$, e.g., $\lambda_m = \nu_p = \mu_q$, then $c_m = a_p + b_q$; if $\lambda_m = \nu_p$ but $\lambda_m$ does
not occur in $\{\mu_k\}$ then $c_m = a_p$; and if $\lambda_m = \mu_q$ but $\lambda_m$ does not occur in $\{\nu_k\}$ then
$c_m = b_q$. Thus, briefly, the sum $\sum c_k t^{\lambda_k}$ is obtained by the formal addition of the terms
of $\sum a_k t^{\nu_k}$ and $\sum b_k t^{\mu_k}$. Similarly, the product $\sum c_k t^{\lambda_k}$ of $\sum a_k t^{\nu_k}$ and $\sum b_k t^{\mu_k}$ as in (3.1) is
obtained by formal multiplication. Thus, the sequence $\{\lambda_k\}$ consists of the sums
$\nu_p + \mu_q$ arranged in increasing order and $c_k = \sum a_p b_q$ where p and q range over the
natural numbers such that $\nu_p + \mu_q = \lambda_k$. It is not difficult to see that all these sums
are finite and that the resulting expression satisfies the conditions of (3.1). Moreover,
our definitions of sums and products are compatible with the relation of equality
introduced earlier, and they turn L into a ring whose zero and unit elements may be
written as $0t^0 + 0t^1 + 0t^2 + \cdots$, or 0, and as $1t^0 + 0t^1 + 0t^2 + \cdots$, or 1.
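For elements of L with only finitely many nonzero terms, the formal addition and multiplication described above are easy to mimic. The sketch below makes several assumptions that are not in the text: an element is stored as a dictionary mapping each exponent to its coefficient, exponents and coefficients are rational (standing in for real numbers), and the function names `normalize`, `add` and `mul` are illustrative. Dropping zero coefficients corresponds to the convention that two expressions are equal when they differ only in terms with coefficient 0.

```python
from fractions import Fraction
from collections import defaultdict

def normalize(terms):
    """Drop zero coefficients, so equal elements get equal representations."""
    return {e: c for e, c in terms.items() if c != 0}

def add(x, y):
    """Formal addition: coefficients at equal exponents are added."""
    out = defaultdict(Fraction)
    for e, c in list(x.items()) + list(y.items()):
        out[e] += c
    return normalize(out)

def mul(x, y):
    """Formal multiplication: c at exponent lambda is the sum of a_p * b_q
    over all pairs with nu_p + mu_q = lambda."""
    out = defaultdict(Fraction)
    for e1, c1 in x.items():
        for e2, c2 in y.items():
            out[e1 + e2] += c1 * c2
    return normalize(out)

half = Fraction(1, 2)
x = {Fraction(0): Fraction(1), half: Fraction(1)}    # 1 + t^(1/2)
y = {Fraction(0): Fraction(1), half: Fraction(-1)}   # 1 - t^(1/2)

print(add(x, y))   # the element 2
print(mul(x, y))   # the element 1 - t
```

The product shows why non-integer exponents cause no difficulty: the exponents of the factors simply add, and the cross terms at exponent 1/2 cancel.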

Now let $\alpha = 1 + \sum_{k=1}^{\infty} a_k t^{\nu_k}$, $0 < \nu_1 < \nu_2 < \cdots \to \infty$, i.e., α is an element of L
as in (3.1) with $\nu_0 = 0$, $a_0 = 1$. We wish to show that α possesses a multiplicative
inverse in L. For this purpose we define β as the formal expansion in powers of t of
the expression

$$1 - \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right) + \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right)^2 - \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right)^3 + \cdots.$$

Again it is not difficult to see that this expansion can be worked out and that it is
of the form $\beta = 1 + \sum_{k=1}^{\infty} b_k t^{\mu_k}$ where $0 < \mu_1 < \mu_2 < \cdots \to \infty$, so that β belongs to L.
We now claim that αβ = 1. To see this, consider the identity

$$(1 + y)(1 - y + y^2 - y^3 + \cdots + y^{2m}) = 1 + y^{2m+1} \tag{3.2}$$

which holds in L for arbitrary natural m. We may substitute $\sum_{k=1}^{\infty} a_k t^{\nu_k}$ for y and
expand on both sides of (3.2). This yields an equation

$$\alpha\beta' = \gamma', \tag{3.3}$$

where β′ is the expansion of

$$1 - \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right) + \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right)^2 - \cdots + \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right)^{2m}$$

and γ′ is the expansion of $1 + \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right)^{2m+1}$. But then β′ differs from β only in
powers of t whose exponent is at least $(2m + 1)\nu_1$ and γ′ differs from 1 only in powers
of t whose exponent also is at least $(2m + 1)\nu_1$. Since m is an arbitrary natural
number, we conclude that αβ = 1, $\beta = \alpha^{-1}$.
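The argument above also indicates how the inverse can be computed up to any prescribed exponent: powers of y beyond a certain index only contribute terms of high order. The following sketch is an illustration under assumptions of its own (finitely many terms, rational exponents and coefficients, dictionary representation, a hypothetical truncation bound `cutoff`); it sums the alternating series 1 − y + y² − ... and discards every term whose exponent reaches the bound.

```python
from fractions import Fraction
from collections import defaultdict

def mul_trunc(x, y, cutoff):
    """Formal product, keeping only exponents below cutoff."""
    out = defaultdict(Fraction)
    for e1, c1 in x.items():
        for e2, c2 in y.items():
            if e1 + e2 < cutoff:
                out[e1 + e2] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def inverse_trunc(y, cutoff):
    """Terms of 1 - y + y^2 - ... with exponent below cutoff.
    y must be nonzero and carry positive exponents only."""
    v1 = min(y)
    assert v1 > 0
    one = {Fraction(0): Fraction(1)}
    neg_y = {e: -c for e, c in y.items()}
    total, power = dict(one), dict(one)
    n = 0
    # (-y)^(n+1) has exponents >= (n+1)*v1, so later powers cannot matter.
    while (n + 1) * v1 < cutoff:
        power = mul_trunc(power, neg_y, cutoff)
        for e, c in power.items():
            total[e] = total.get(e, Fraction(0)) + c
        n += 1
    return {e: c for e, c in total.items() if c != 0}

y = {Fraction(1, 2): Fraction(1), Fraction(1): Fraction(1)}   # t^(1/2) + t
alpha = dict(y)
alpha[Fraction(0)] = Fraction(1)                              # 1 + t^(1/2) + t
beta = inverse_trunc(y, cutoff=3)
print(mul_trunc(alpha, beta, cutoff=3))   # the element 1, up to order t^3
```

The loop bound mirrors the estimate $(2m + 1)\nu_1$ used in the text: once the lowest exponent of a power of y passes the cutoff, that power and all later ones are invisible at the chosen precision.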

Now let $\alpha \in L$ be different from zero, otherwise arbitrary. Then $\alpha = \sum_{k=0}^{\infty} a_k t^{\nu_k}$
where we may assume that $a_0 \neq 0$. Putting $\alpha = a_0 t^{\nu_0} \alpha'$ where

$$\alpha' = 1 + \sum_{k=1}^{\infty} (a_k/a_0) t^{\nu_k - \nu_0},$$

we then obtain $a_0^{-1} t^{-\nu_0} \alpha'^{-1}$ as the multiplicative inverse of α.
Thus, L is a field. We introduce an ordering of L by defining that an element
$\alpha \in L$, $\alpha \neq 0$, is positive if and only if the nonvanishing coefficient $a_m$ with lowest
subscript m in the expression $\alpha = \sum_{k=0}^{\infty} a_k t^{\nu_k}$ is positive. Also, L obtains a valuation
by defining $v(\alpha) = \nu_m$ (so that $a_m \neq 0$, $a_k = 0$ for k < m), for $\alpha \neq 0$, together with
v(0) = ∞ in accordance with our general convention.

In this valuation, the valuation ring $O_L$ consists of all elements of L which can be
written as $\sum a_k t^{\nu_k}$ with $\nu_0 \ge 0$, and this is also the ring of finite elements of L in the
ordering of L; and the valuation ideal $J_L$ consists of all $\sum a_k t^{\nu_k}$ with $\nu_0 > 0$ and
coincides with the set of infinitesimal elements of L. Thus, the residue class field of L
with respect to its valuation coincides with the residue class field of L with respect to
its ordering and is, in fact, the field of real numbers R. Also, since $J_L \neq \{0\}$, L is
nonarchimedean.
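The valuation and the ordering of L both depend only on the lowest-order term, which makes them simple to compute for elements with finitely many terms. The sketch below is illustrative only: it assumes the dictionary representation (exponent mapped to nonzero coefficient), rational exponents and coefficients in place of real ones, and the names `valuation`, `is_positive`, `add` and `mul` are not from the paper. It checks conditions 2.8 and 2.9 on one example and confirms that t lies below several positive rationals.

```python
import math
from fractions import Fraction
from collections import defaultdict

def add(x, y):
    """Formal sum of two finitely supported series."""
    out = defaultdict(Fraction)
    for e, c in list(x.items()) + list(y.items()):
        out[e] += c
    return {e: c for e, c in out.items() if c != 0}

def mul(x, y):
    """Formal product of two finitely supported series."""
    out = defaultdict(Fraction)
    for e1, c1 in x.items():
        for e2, c2 in y.items():
            out[e1 + e2] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def valuation(x):
    """v(x): the lowest exponent present; math.inf stands for v(0)."""
    return min(x) if x else math.inf

def is_positive(x):
    """x > 0 iff the coefficient at the lowest exponent is positive."""
    return bool(x) and x[min(x)] > 0

x = {Fraction(1, 2): Fraction(3), Fraction(2): Fraction(-1)}   # 3t^(1/2) - t^2
y = {Fraction(-1): Fraction(-2), Fraction(0): Fraction(5)}     # -2t^(-1) + 5
t = {Fraction(1): Fraction(1)}
minus_t = {Fraction(1): Fraction(-1)}

print(valuation(mul(x, y)) == valuation(x) + valuation(y))          # condition 2.8
print(valuation(add(x, y)) >= min(valuation(x), valuation(y)))      # condition 2.9
# t is positive, yet q - t is positive for each sampled rational q > 0.
print(is_positive(t) and all(
    is_positive(add({Fraction(0): Fraction(1, n)}, minus_t)) for n in (1, 10, 1000)))
```

Here x has positive valuation and so belongs to the valuation ideal, while y has valuation −1 and is infinite in the ordering.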

There is a natural (and obvious) embedding (injection) of R into L: $a \to a = at^0 + 0t^1 + 0t^2 + \cdots$,
and this extends, equally obviously, to an embedding of R[t] into
L:


$$a_0 + a_1 t + \cdots + a_n t^n \to a_0 t^0 + a_1 t^1 + a_2 t^2 + \cdots + a_n t^n + 0t^{n+1} + \cdots$$


and hence, to an embedding of R(t) into L. The embedding is order preserving for 
the ordering of R(t) defined in section 2 above. 

It is shown in [5] that L is complete. It is also shown there that the field L’ which 
is obtained by taking complex coefficients in place of the real coefficients in L, is 
algebraically closed. Since $L' = L(\sqrt{-1})$ it follows (compare [7]) that L is real-
closed, i.e., that every positive element of L possesses a square root in L and that
every polynomial of odd degree in L[x] possesses a root in L. It follows in particular
that a positive element of L possesses roots of all orders n = 2, 3, 4, .... The same
result is established by elementary means in [2] and will be used later in this paper. 

Now let f(x) be a real-valued infinitely differentiable function of a real variable 
which is defined in an interval a < x < b, $a, b \in R$. On passing from R to L, we find
that the interval a < x < b in L consists of points $x = \xi + \sum_{k=1}^{\infty} a_k t^{\nu_k}$, $0 < \nu_1 < \cdots \to \infty$,
of three kinds,


(i) $a < \xi < b$,

(ii) $\xi = a$, $\sum_{k=1}^{\infty} a_k t^{\nu_k} > 0$, and

(iii) $\xi = b$, $\sum_{k=1}^{\infty} a_k t^{\nu_k} < 0$.


In all these cases ξ is the unique real number which is infinitely close to x, i.e., such
that x − ξ is infinitely small and (by analogy with the terminology in Nonstandard
Analysis) we call ξ the standard part of x, ξ = °x.

Laugwitz extends the function f(x) to values of x in L with standard part ξ,
$a < \xi < b$, by using the formal Taylor expansion of f(x),

$$f(x + h) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(x) h^n.$$

Thus, he defines for $x = \xi + \sum_{k=1}^{\infty} a_k t^{\nu_k}$,

$${}^L f(x) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(\xi) \left(\sum_{k=1}^{\infty} a_k t^{\nu_k}\right)^n \tag{3.4}$$

where it is understood that ${}^L f(x)$ is the element of L which is obtained by expanding
the right hand side of (3.4) and rearranging it in powers of t. Once again, the condition
$\nu_1 > 0$ shows that this can be done.
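Definition (3.4) can be followed term by term for a concrete function. As an illustration under assumptions not made in the text, the sketch below takes $f(x) = \sqrt{x}$ at standard part ξ = 1, whose Taylor coefficients $f^{(n)}(1)/n!$ are the binomial coefficients $\binom{1/2}{n}$, and the infinitesimal increment $h = t^{1/2}$. Series are stored as dictionaries with rational exponents and coefficients and are cut off at a hypothetical bound `cutoff`; the names `mul_trunc`, `taylor_extend` and `sqrt_coeffs` are illustrative.

```python
from fractions import Fraction
from collections import defaultdict

def mul_trunc(x, y, cutoff):
    """Formal product, keeping only exponents below cutoff."""
    out = defaultdict(Fraction)
    for e1, c1 in x.items():
        for e2, c2 in y.items():
            if e1 + e2 < cutoff:
                out[e1 + e2] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def taylor_extend(coeffs, h, cutoff):
    """Sum of coeffs[n] * h**n below cutoff, where coeffs[n] plays the role of
    f^(n)(xi)/n! and h has positive exponents only. The caller must supply
    enough coefficients for the powers of h to pass the cutoff."""
    total = {}
    power = {Fraction(0): Fraction(1)}          # h**0
    for c in coeffs:
        for e, a in power.items():
            total[e] = total.get(e, Fraction(0)) + c * a
        power = mul_trunc(power, h, cutoff)
        if not power:                           # all further terms lie above cutoff
            break
    return {e: a for e, a in total.items() if a != 0}

def sqrt_coeffs(count):
    """Binomial coefficients C(1/2, n): Taylor coefficients of sqrt(x) at x = 1."""
    out, c = [], Fraction(1)
    for n in range(count):
        out.append(c)
        c = c * (Fraction(1, 2) - n) / (n + 1)
    return out

h = {Fraction(1, 2): Fraction(1)}               # h = t^(1/2)
root = taylor_extend(sqrt_coeffs(10), h, cutoff=3)
square = mul_trunc(root, root, cutoff=3)
print(square)   # 1 + t^(1/2), up to order t^3
```

Squaring the truncated result recovers $1 + t^{1/2}$ below the cutoff, which is consistent with the remark that positive elements of L possess square roots although $\sqrt{x}$ cannot be extended to R(t).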

We shall show in the following sections that the definition proposed by Laug- 
witz is obtained in a natural way by relating L to a nonstandard model of Analysis. 


4. The field °R. Let *R be a nonstandard model of Analysis (cf. [4] and [6]).
We shall suppose that *R is sequentially comprehensive. That is to say, if
$a_0, a_1, a_2, \ldots, a_n, \ldots$, $n \in N$, is a sequence of entities of *R (of the same type, if type
restrictions are adopted), e.g., a sequence of numbers of *R, then there exists an
internal sequence $\{s_n\}$ in *R (where n now ranges over *N) such that $s_n = a_n$ for all
finite n.

There exist sequentially comprehensive *R. More particularly, all *R which are
ultrapowers are sequentially comprehensive. Thus, suppose $^*R = R^I/D$ where D is a
free ultrafilter on the index set I. Every internal entity of *R is represented by (is an
equivalence class of) functions f(ν) on I. Let $f_n(\nu)$ represent $a_n$, n = 0, 1, 2, ..., and for
each $\nu \in I$, consider $s(\nu) = \{f_n(\nu)\}$. Then s(ν), ν ranging over I, represents an internal
sequence $\{s_n\}$ in *R. We claim that for each finite k, the value of that sequence is
just $a_k$. Now, in order to obtain the value of $\{s_n\}$ for n = k, we have to substitute the
constant function with value k for n in $f_n(\nu)$. This yields precisely $f_k(\nu)$, i.e., $a_k$.

Supposing, from now on, that *R is sequentially comprehensive, we wish to 
show that the set of infinite natural numbers, *N − N, cannot be coinitial with ω*.
In other words: 


4.1. THEOREM. Let $a_0 > a_1 > a_2 > \cdots > a_n > \cdots$, $n \in N$, be a strictly decreasing
sequence of infinite natural numbers, internal or external. Then there exists an
infinite natural number a such that $a_n > a$ for all $n \in N$.


Proof. Since *R is sequentially comprehensive, we may suppose that, for all
$n \in N$, $a_n = s_n$ where $\{s_n\}$ is an internal sequence of numbers of *R. Consider the
internal sequence

$$t_n = \frac{n}{\min(s_0, s_1, \ldots, s_n)}, \qquad n \in {}^*N.$$

Then $0 \le t_n < 1$ for all finite n but $t_n > 1$ for large infinite n. Hence there exists
a smallest m, which must then be infinite, such that $0 \le t_m < 1$ does not hold.
Thus, for k = m − 1,

$$0 \le \frac{k}{\min(s_0, s_1, \ldots, s_k)} < 1.$$

This shows that $k < a_0, k < a_1, \ldots, k < a_n, \ldots$ for all finite n and proves the theorem.
Now let p be an arbitrary but fixed positive infinitesimal number in *R. We
define subsets $M_0$ and $M_1$ of *R by

$$M_0 = \{x \in {}^*R \mid |x| < p^{-n} \text{ for some finite positive integer } n\},$$
$$M_1 = \{x \in {}^*R \mid |x| < p^{n} \text{ for all finite positive integers } n\}.$$

Evidently, $M_1 \subset M_0$, and $M_0 \supset R$. Both $M_0$ and $M_1$ are rings under the operations
of *R. For if $|x| < p^{-n}$, $|y| < p^{-m}$, with $n \le m$ say, then

$$|x + y| \le |x| + |y| \le 2p^{-m} \le p^{-(m+1)}$$

and $|xy| < p^{-(n+m)}$, so $M_0$ is a ring. And if $|x| < p^n$, $|y| < p^n$ then $|x + y| < 2p^n < p^{n-1}$,
$|xy| \le p^{2n}$. Since, in the definition of $M_1$, n is arbitrary, this shows that $M_1$
also is a ring.

Moreover, $M_1$ is an ideal in $M_0$, for if $x \in M_1$ and $y \in M_0$ then $|y| < p^{-n}$ for
some natural number n, and since $|x| < p^{m+n}$ for all natural m, it follows that
$|xy| < p^m$ for all natural m, $xy \in M_1$. $M_1$ is a proper ideal since it does not contain 1.
Finally, $M_1$ is a maximal ideal in $M_0$. For let $J \supset M_1$ be another ideal in $M_0$ such
that $J - M_1$ is not empty, and let $x \in J - M_1$. Then $|x| > p^m$ for some finite natural
number m and so $|x^{-1}| < p^{-m}$, $x^{-1} \in M_0$. Hence $1 = xx^{-1} \in J$, $J = M_0$, showing
that $M_1$ is maximal.

We conclude that the quotient ring °R = $M_0/M_1$ is a field. Moreover, the
canonical map

$$\psi: M_0 \to {}^{\circ}R \tag{4.2}$$

induces an ordering in °R. For let $x \in M_0 - M_1$, x > 0, and let x + y, $y \in M_1$, be any
other element of the coset of x with respect to $M_1$. Then $|x| > p^m$ for some finite
natural number m and $|y| \le p^n$ for all finite natural numbers n. Hence |y| < |x|,
and so $x + y \ge x - |y| = |x| - |y| > 0$; all elements of the coset of x are positive.
Accordingly, we may define an ordering in °R by defining that an element α ∈ °R,
$\alpha \neq 0$, is positive if and only if the elements of $\psi^{-1}\alpha$ are positive. Then the sum and
product of positive elements of °R are positive but 0 ∈ °R is not positive. This shows
that our definition turns °R into an ordered field. We also observe that for any
α ∈ °R, $\psi^{-1}\alpha$ is an interval in $M_0$ and *R. Finally, since $M_1$ contains only the single
standard number 0, the restriction of ψ to R provides an embedding of R (as a subfield of *R) in °R.
Next, we define a valuation in °R, as follows. For any α ∈ °R, $\alpha \neq 0$, let x and
x + y be elements of $\psi^{-1}\alpha$, $y \in M_1$, and consider $\log_p|x|$ and $\log_p|x + y|$. Since |x|
and |x + y| are greater than some positive, and smaller than some negative power
of p, $\log_p|x|$ and $\log_p|x + y|$ are finite and possess standard parts. We claim that

$${}^{\circ}(\log_p|x|) = {}^{\circ}(\log_p|x + y|),$$

i.e., that

$$\log_p|x + y| - \log_p|x| = \log_p|1 + y/x|$$

is infinitesimal. But $\log_p|1 + (y/x)| = \ln|1 + (y/x)| / \ln p$. Since y/x is infinitesimal
and ln|w| is a standard function which is continuous at w = 1, ln|1 + (y/x)| is
infinitesimal. Hence $\log_p|1 + (y/x)|$ also is infinitesimal, as asserted.

Accordingly, we obtain a unique definition of a function v(α) for α ∈ °R, $\alpha \neq 0$, by
putting $v(\alpha) = {}^{\circ}(\log_p|x|)$ for any $x \in \psi^{-1}\alpha$. We claim that this defines a valuation of
the field °R.

Let α, β ∈ °R, $\alpha \neq 0$, $\beta \neq 0$ and let $x \in \psi^{-1}\alpha$, $y \in \psi^{-1}\beta$. Then

$${}^{\circ}(\log_p|xy|) = {}^{\circ}(\log_p|x|) + {}^{\circ}(\log_p|y|)$$

and so v(αβ) = v(α) + v(β), as required. Next, suppose $\alpha + \beta \neq 0$, then we have to
show that $v(\alpha + \beta) \ge \min(v(\alpha), v(\beta))$ or, equivalently, that

$${}^{\circ}(\log_p|x + y|) \ge \min({}^{\circ}(\log_p|x|), {}^{\circ}(\log_p|y|)). \tag{4.3}$$

We may suppose without essential loss of generality that $\log_p|x| \ge \log_p|y|$.
Then (4.3) will hold precisely if there is an infinitesimal η such that

$$\log_p|x + y| \ge \log_p|y| - \eta,$$

i.e., such that

$$\log_p\left|1 + \frac{x}{y}\right| \ge -\eta.$$

Putting x/y = w, we have to show $\log_p|1 + w| \ge -\eta$ for $\log_p|w| \ge 0$ (where we
may rule out w = −1 because of $\alpha + \beta \neq 0$). Put $\sigma = \log_p|w|$, $|w| = p^{\sigma}$, where
$\sigma \ge 0$, so that $p^{\sigma} \le 1$; then

$$|1 + w| \le 1 + |w| = 1 + p^{\sigma} \le 2 = p^{\log_p 2},$$

and, since $\log_p$ is decreasing,

$$\log_p|1 + w| \ge \log_p 2.$$

But $\log_p 2$ is (negative) infinitesimal, and so (4.3) is proved. We supplement the
definition of v(α) as usual by putting v(0) = ∞.

The valuation ring of the valuation just defined will be denoted by $O_p$. It is not
difficult to see that $O_p$ includes the ψ-images of all finite elements of *R. However, $O_p$
includes other elements as well. For example, let λ = ψ(ln p). Then $v(\lambda) = {}^{\circ}(\log_p|\ln p|)$
$= {}^{\circ}(\ln|\ln p| / \ln p)$. But the expression in the parentheses on the right hand side is
infinitesimal, for ln p is (negative) infinite and

$$\lim_{x \to \infty} \frac{\ln x}{x} = 0.$$

Hence v(λ) = 0.

We shall now show that the field °R is complete for the valuation defined above.
Defining the distance between two elements of °R, α and β, by $d(\alpha, \beta) = c^{-v(\alpha - \beta)}$ (see
the end of section 2 above) let $\{\alpha_n\}$ be a Cauchy sequence in this metric,

$$\lim_{n \to \infty,\, m \to \infty} d(\alpha_n, \alpha_m) = 0. \tag{4.4}$$

Then we have to show that $\{\alpha_n\}$ converges to a limit α in °R.

Choose elements $x_n \in \psi^{-1}\alpha_n$, n = 0, 1, 2, ..., $n \in N$. Since *R is sequentially
comprehensive there exists an internal sequence $\{s_n\}$ of elements of *R such that $s_n = x_n$ for
all finite n. We shall write $x_n$ in place of $s_n$ also for infinite n. By (4.4)

$$\lim_{n \to \infty,\, m \to \infty} v(\alpha_n - \alpha_m) = \infty.$$

Equivalently, given any finite natural number k, there exists a finite natural $j = j_k$
such that

$$\log_p|x_n - x_m| > k \quad \text{for } n, m > j_k, \quad n, m \in N. \tag{4.5}$$

Now since (4.5) holds for all finite n and m greater than j, it holds for all n > j,
m > j, n + m finite, $j = j_k$. A standard argument of Nonstandard Analysis, which
was exemplified in the proof of 4.1, now shows that there exists an infinite natural
$\omega = \omega_k$ such that (4.5) holds for all n > j, m > j with $n < \omega_k$, $m < \omega_k$.
Moreover, by determining $\omega_0, \omega_1, \omega_2, \ldots$ one after the other, we may evidently
assume that $\omega_0 > \omega_1 > \omega_2 > \cdots$.
Appealing to 4.1, we may then choose an infinite natural number Ω which is smaller
than $\omega_0, \omega_1, \omega_2, \ldots$ and, obviously, being infinite, larger than $j_0, j_1, j_2, \ldots$. Then,

$$\log_p|x_n - x_\Omega| > k \quad \text{for } n > j_k, \quad n \in N, \quad k \in N. \tag{4.6}$$

(4.6) shows in the first place that $x_\Omega \in M_0$. To see this, choose $n > j_0$; then
$\log_p|x_n - x_\Omega| > 0$, so $|x_n - x_\Omega|$ is finite. Also, $x_n \in M_0$, so $|x_n| \le p^{-m}$ for some
positive integer m and $|x_\Omega| \le |x_\Omega - x_n| + |x_n| \le 2p^{-m} < p^{-(m+1)}$, $x_\Omega \in M_0$.

Now let $\alpha = \psi x_\Omega$; then we wish to show that $\lim_{n \to \infty} \alpha_n = \alpha$ or, which is equivalent,
that

$$\lim_{n \to \infty} v(\alpha_n - \alpha) = \infty. \tag{4.7}$$

But this is an immediate consequence of (4.6), since (4.6) implies

$${}^{\circ}(\log_p|x_n - x_\Omega|) > k - 1 \quad \text{for } n > j_k, \quad n \in N, \quad k \in N$$

and this is the same as

$$v(\alpha_n - \alpha) > k - 1 \quad \text{for } n > j_k, \quad n \in N, \quad k \in N$$

which is just an explicit expression for the validity of (4.7). Thus, we have shown that
°R is complete.
Writing p also for its image ψp in °R, consider any infinite series in °R of the form

$$a_0 p^{\nu_0} + a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots, \qquad a_k \in R, \quad \nu_0 < \nu_1 < \nu_2 < \cdots \to \infty, \tag{4.8}$$

where the $\nu_k$ are standard real. The partial sums of (4.8) are

$$\sigma_k = a_0 p^{\nu_0} + a_1 p^{\nu_1} + \cdots + a_k p^{\nu_k}, \qquad k = 0, 1, 2, \ldots.$$

The value of any monomial in (4.8) is, for $a_j \neq 0$, $v(a_j p^{\nu_j}) = v(a_j) + v(p^{\nu_j}) = 0 + \nu_j = \nu_j$,
with $v(a_j p^{\nu_j}) = \infty$ for $a_j = 0$. Hence $v(\sigma_k) = \nu_j$ where j is the smallest
subscript $\le k$ for which $a_j \neq 0$, if any, otherwise $v(\sigma_k) = \infty$. Also, for $0 \le k < l$,

$$\sigma_l - \sigma_k = a_{k+1} p^{\nu_{k+1}} + \cdots + a_l p^{\nu_l}$$

and so $v(\sigma_l - \sigma_k) \ge \nu_{k+1}$. This shows that $\{\sigma_k\}$ is a Cauchy sequence, and the limit
of that sequence, σ, is just the sum of (4.8). Also, $v(\sigma) = \nu_j$ where j is the lowest
subscript for which $a_j \neq 0$ or, if there is no such j, i.e., if all $a_j$ vanish, v(σ) = ∞ and
this is the case if and only if σ = 0. As usual in the theory of infinite series, we identify
(4.8) with its sum in °R. It is then not difficult to verify that the sum of two numbers of
°R, σ and τ, given by (4.8) and


$$b_0 p^{\mu_0} + b_1 p^{\mu_1} + b_2 p^{\mu_2} + \cdots, \qquad b_k \in R, \quad \mu_0 < \mu_1 < \mu_2 < \cdots \to \infty, \tag{4.9}$$

is represented by an expression

$$c_0 p^{\lambda_0} + c_1 p^{\lambda_1} + c_2 p^{\lambda_2} + \cdots$$

which is obtained from (4.8) and (4.9) just as the sum $\sum c_k t^{\lambda_k}$ was obtained from
$\sum a_k t^{\nu_k}$ and $\sum b_k t^{\mu_k}$ as elements of L in section 3 above. The product of (4.8) and (4.9)
also is obtained by the procedure described in section 3, with p for t. It follows that
the mapping

$$\Phi: a_0 t^{\nu_0} + a_1 t^{\nu_1} + a_2 t^{\nu_2} + \cdots \to a_0 p^{\nu_0} + a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots, \qquad a_k \in R, \quad \nu_0 < \nu_1 < \nu_2 < \cdots \to \infty, \tag{4.10}$$

where the $\nu_k$ are standard real, is a homomorphism from L into °R. This
homomorphism is an injection since Φα = 0 implies $a_0 = a_1 = a_2 = \cdots = 0$ (see above)
and, hence, α = 0. It follows that ΦL is a field which is isomorphic to L and we write
ΦL = °L. Evidently, Φ is analytic (i.e., value preserving, v(Φ(α)) = v(α)). But Φ is
also order preserving, as can be shown by verifying that, for any $\alpha \in L$, Φα > 0 if and
only if α > 0. Now for $\alpha \neq 0$, α > 0 if and only if the first nonvanishing $a_k$ is positive,
so we only have to show that an expression as in (4.8), $\sigma = a_0 p^{\nu_0} + a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots$,
is positive provided (without loss of generality) $a_0 > 0$. Now, we may write σ = ψs,
where $s = a_0 p^{\nu_0} + \tau$, ${}^{\circ}(\log_p|\tau|) \ge \nu_1$. It follows that if ν is an arbitrary standard
real number between $\nu_0$ and $\nu_1$, $\nu_0 < \nu < \nu_1$, then $\log_p|\tau| > \nu$, $|\tau| < p^{\nu}$, $a_0 p^{\nu_0} > |\tau|$
and so

$$s = a_0 p^{\nu_0} + \tau \ge a_0 p^{\nu_0} - |\tau| > 0$$

and hence, σ > 0. Thus Φ is order preserving, as asserted.


5. Functions in °R. Let f(x) be any real-valued function defined for a < x < b,
$a, b \in R$. On passing to *R, f(x) is extended automatically to a function *f(x) which
is defined for a < x < b in *R. We wish to find a natural extension of the function
f(x) as we pass from R to °R.

Such an extension can be obtained, under certain conditions, as follows. Let ξ
be any element of °R between a and b, $a < \xi < b$. Let ψ be the canonical
homomorphism from $M_0$ to °R as before (see (4.2) above). Then we define

$${}^{\circ}f(\xi) = \psi({}^*f(x)) \quad \text{for } x \in \psi^{-1}\xi, \quad a < x < b \tag{5.1}$$

provided the expression on the right hand side of (5.1) is independent of the particular
choice of x subject to the stated conditions (a < x < b, ψx = ξ).

Suppose in particular that f(x) satisfies a Lipschitz condition in any closed
subinterval of a < x < b. Thus, for any a < a′ < b′ < b there exists a k = k(a′, b′)
such that for any $a' \le x_1 < x_2 \le b'$,

$$|f(x_2) - f(x_1)| \le k|x_2 - x_1|. \tag{5.2}$$

Passing from R to *R, we see that (5.2) still holds, for standard a′, b′ and for
arbitrary $x_1$, $x_2$ in the interval ⟨a′, b′⟩, if we affix a star to $f(x_1)$ and $f(x_2)$. In
particular, it therefore holds for two points $x_1$, $x_2$ of *R which are infinitely close to
some standard $x_0$, $a < x_0 < b$ (where the constant k may depend on $x_0$).

Now let ξ ∈ °R be infinitely close to $x_0 \in R$. Then if $x_1$, $x_2$ belong to $\psi^{-1}\xi$, both $x_1$
and $x_2$ are infinitely close to $x_0$ in *R, and (5.2) applies for an appropriate standard k.
But then $x_2 - x_1 \in M_1$ and so, by (5.2), ${}^*f(x_2) - {}^*f(x_1) \in M_1$, $\psi\,{}^*f(x_2) = \psi\,{}^*f(x_1)$. This
shows that in this case, (5.1) provides a unique definition for °f(ξ).

In particular, the Lipschitz condition is satisfied if f(x) has a continuous derivative 
for a< x <b or, more particularly, if f(x) is infinitely differentiable in that interval. 
Suppose that this is the case and consider the restriction of °f(x) to points

$$\xi = a_0 + a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots, \qquad 0 < \nu_1 < \nu_2 < \cdots \to \infty, \quad a < a_0 < b.$$

We may compare °f(x) for such a point with the function which is obtained by
transferring Laugwitz' definition from L to °L, i.e., with the function

$$F(x) = \Phi({}^L f(\Phi^{-1}x)).$$

We propose to show that °f(x) actually coincides with F(x) for such argument
values,

$${}^{\circ}f(\xi) = \Phi({}^L f(\Phi^{-1}\xi)). \tag{5.3}$$


In order to verify this identity, we observe that, except for rearrangements (which
can be justified without difficulty within °R), the right hand side of (5.3) is simply
the formal Taylor expansion in °R of f(x) about the point $a_0$. Thus, our claim is that

$${}^{\circ}f(\xi) = f(a_0) + \frac{f'(a_0)}{1!}(a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots) + \frac{f''(a_0)}{2!}(a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots)^2 + \cdots + \frac{f^{(n)}(a_0)}{n!}(a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots)^n + \cdots, \tag{5.4}$$

in other words, that the Taylor series of °f about $a_0$ converges at ξ. Put $\eta = a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots$
and choose $h \in \psi^{-1}\eta$, so that $a_0 + h \in \psi^{-1}\xi$. By Taylor's formula with
Lagrange's remainder term

$${}^*f(a_0 + h) = {}^*f(a_0) + \frac{{}^*f'(a_0)}{1!}h + \frac{{}^*f''(a_0)}{2!}h^2 + \cdots + \frac{{}^*f^{(n)}(a_0)}{n!}h^n + \frac{{}^*f^{(n+1)}(a_0 + \theta h)}{(n + 1)!}h^{n+1}$$

where 0 < θ < 1. Now on the right hand side of this identity ${}^*f^{(k)}(a_0) = f^{(k)}(a_0)$ since
$a_0$ is standard. Also, since f(x) is infinitely differentiable, $f^{(n+1)}(x)$, and hence
${}^*f^{(n+1)}(x)$, is bounded by a standard real number in any standard closed subinterval
of ⟨a, b⟩ and hence, is bounded by a standard number B in the monad of $a_0$. Hence

$$\left|\frac{{}^*f^{(n+1)}(a_0 + \theta h)}{(n + 1)!}h^{n+1}\right| \le B|h|^{n+1}. \tag{5.5}$$

Let ν be any standard positive number less than $\nu_1$. Then (5.5) together with the fact
that $v(\eta) = {}^{\circ}(\log_p|h|) \ge \nu_1$ shows that

$$\left|\frac{{}^*f^{(n+1)}(a_0 + \theta h)}{(n + 1)!}h^{n+1}\right| < p^{(n+1)\nu}.$$

Hence

$$\left|{}^*f(a_0 + h) - \sum_{k=0}^{n} \frac{f^{(k)}(a_0)}{k!}h^k\right| < p^{(n+1)\nu}$$

and so

$$v\left({}^{\circ}f(\xi) - \sum_{k=0}^{n} \frac{f^{(k)}(a_0)}{k!}\eta^k\right) \ge (n + 1)\nu.$$

This shows that

$${}^{\circ}f(\xi) = \lim_{n \to \infty} \sum_{k=0}^{n} \frac{f^{(k)}(a_0)}{k!}\eta^k,$$

proving (5.4).

The identity (5.3) is of interest in itself since it provides a natural justification of 
Laugwitz’ definition within a more comprehensive framework. Beyond that, by 
relating Laugwitz’ theory to that wider framework, we are able to make use of the 
resources of Nonstandard Analysis in order to provide satisfactory answers to 
several problems which were left open by Laugwitz. We shall turn to this task in 
our next section. 


6. The intermediate value theorem in L. In view of (5.3), the function °f(x),
with values restricted to °L, behaves in exactly the same way as the function ${}^L f(x)$
on a corresponding interval in L. Consider the real valued function f(x) which is
defined in the interval −1 < x < 1 by

$$f(x) = \begin{cases} e^{-1/|x|} & \text{for } x \neq 0, \\ 0 & \text{for } x = 0. \end{cases} \tag{6.1}$$

Then f(x) is infinitely differentiable in the entire interval of definition, including
x = 0. At that point $f^{(n)}(0) = 0$ for n = 0, 1, 2, ....

Let $x_1 = 0$, $x_2 = \tfrac{1}{2}$. Then °f($x_1$) = f($x_1$) = 0, °f($x_2$) = f($x_2$) = $1/e^2$. If °f(x)
satisfied the intermediate value theorem, there would exist a ξ ∈ °L, $0 < \xi < \tfrac{1}{2}$, such
that °f(ξ) = p. We shall show that there is no such ξ.

Suppose first that ξ is infinitely close to 0,

$$\xi = a_0 p^{\nu_0} + a_1 p^{\nu_1} + a_2 p^{\nu_2} + \cdots, \qquad 0 < \nu_0 < \nu_1 < \cdots \to \infty, \quad a_0 > 0.$$

Then, by (5.4),

$${}^{\circ}f(\xi) = f(0) + \frac{f'(0)}{1!}\xi + \frac{f''(0)}{2!}\xi^2 + \cdots = 0$$

so °f(ξ) cannot be equal to p.

Suppose next that ξ is not infinitely close to 0. Then $\xi = a_0 + \eta$, where $0 < a_0 \le \tfrac{1}{2}$,
v(η) > 0 and so, by (5.4), °f(ξ) = $f(a_0) + \varepsilon$, where v(ε) > 0. This shows that °f(ξ) is
infinitely close to $f(a_0)$, which is a standard real number different from 0, and so
again °f(ξ) cannot be equal to p, which is infinitesimal.


By contrast, if f(x) is continuous in an interval a < x < b and if the definition
(5.1) is effective in an interval $x_1 \le x \le x_2$ where $a < x_1 < x_2 < b$, $x_1, x_2$ ∈ °R, then
the intermediate value theorem does apply in °R. That is to say, under these
conditions:


6.2. THEOREM. If °f($x_1$) < η < °f($x_2$) for η ∈ °R, then there exists a ξ ∈ °R,
$x_1 < \xi < x_2$, such that °f(ξ) = η.


To see this, we only have to choose elements of *R, $x_1'$, $x_2'$, η′ such that $\psi x_1' = x_1$,
$\psi x_2' = x_2$, ψη′ = η. Then ${}^*f(x_1') < \eta' < {}^*f(x_2')$ and so, by the intermediate value
theorem for *f(x) there exists a ξ′ ∈ *R, $x_1' < \xi' < x_2'$, such that *f(ξ′) = η′. Putting
ξ = ψξ′ we then have °f(ξ) = ψ(*f(ξ′)) = ψη′ = η. This shows that the intermediate
value theorem is satisfied in this case.

For the remainder of this section, it will be our main purpose to show that the 
intermediate value theorem holds also in °L for functions °f(x) which are obtained 
from infinitely differentiable functions f(x) in R—and hence, holds also in L for the 
corresponding functions “f(x)—subject to rather mild restrictions, as follows. 


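6.3. THEOREM. Let $f(x)$ be a real-valued function which is defined and infinitely
differentiable for $a < x < b$, $a, b \in R$, and let ${}^{\rho}f(x)$ be defined by 5.1. Suppose that
for every $x \in R$, $a < x < b$, there is a positive integer $n$ such that $f^{(n)}(x) \neq 0$. For
any $x_1, x_2 \in {}^{\rho}L$, $a < x_1 < x_2 < b$, let $a_0$ and $b_0$ be the uniquely defined elements of
$R$ which are infinitely close to $x_1$ and $x_2$ respectively, i.e.,
\[
\begin{aligned}
x_1 &= a_0 + a_1\rho^{\nu_1} + a_2\rho^{\nu_2} + \cdots, & 0 < \nu_1 < \nu_2 < \cdots \to \infty, \\
x_2 &= b_0 + b_1\rho^{\mu_1} + b_2\rho^{\mu_2} + \cdots, & 0 < \mu_1 < \mu_2 < \cdots \to \infty,
\end{aligned}
\]
and suppose that $a < a_0 \le b_0 < b$. Let $\eta$ be an element of ${}^{\rho}L$ such that
${}^{\rho}f(x_1) < \eta < {}^{\rho}f(x_2)$.
Then there exists a $\xi \in {}^{\rho}L$, $x_1 < \xi < x_2$, such that ${}^{\rho}f(\xi) = \eta$.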


Proof. Comparing 6.3 with 6.2 (which applies to the situation described in 6.3)
we see that we only have to show that the $\xi \in {}^{\rho}R$ mentioned in 6.2 belongs more
particularly to ${}^{\rho}L$. Choosing $x_1', x_2', \eta'$ as in the proof of 6.2 such that $wx_1' = x_1$,
$wx_2' = x_2$, $w\eta' = \eta$ we have, for some $\xi' \in {}^{*}R$, $x_1' < \xi' < x_2'$, ${}^{*}f(\xi') = \eta'$ and hence
${}^{\rho}f(\xi) = \eta$ where $\xi = w\xi'$. Now $x_1' < \xi' < x_2'$ implies that $\xi'$ is finite and has a standard
part, ${}^{\circ}\xi' = d_0$, where $a < a_0 \le d_0 \le b_0 < b$. At the same time, $\eta$ must be of the form
$e_0 + e_1\rho^{\lambda_1} + e_2\rho^{\lambda_2} + \cdots$, $0 < \lambda_1 < \lambda_2 < \cdots \to \infty$, since it is in ${}^{\rho}L$ and finite. Hence,
${}^{\circ}\eta' = e_0$ and $f(d_0) = e_0$.

Suppose now that $f'(d_0) \neq 0$. Then the inversion theorem is applicable. It follows
that there exist $h_1 > 0$, $h_2 > 0$, $k_1 > 0$, $k_2 > 0$, such that $f(x)$ is a one-to-one mapping
of the interval $D$ defined by $d_0 - h_1 < x < d_0 + h_2$ on the interval $E$ defined by
$e_0 - k_1 < y < e_0 + k_2$ such that the inverse function $g(y) = f^{-1}(y)$ is infinitely
differentiable on $E$. Passing to ${}^{*}R$, we find that the infinitely differentiable function
${}^{*}f(x)$ maps ${}^{*}D$ in one-to-one correspondence on ${}^{*}E$ such that ${}^{*}g(y)$ is the inverse of
this mapping and is infinitely differentiable as well (in the sense of ${}^{*}R$). Hence,
${}^{*}f(\xi') = \eta'$ entails ${}^{*}g(\eta') = \xi'$ and so
\[
\xi = w\xi' = w({}^{*}g(\eta')) = {}^{\rho}g(\eta) \in {}^{\rho}L,
\]
proving our assertion in this case.
Dropping the restriction that $f'(d_0) \neq 0$ (but not excluding this case) we put
$F(x) = f(x) - f(d_0)$ and define $H(x)$ for $a < x < b$ by
\[
H(x) = \begin{cases} \dfrac{F(x)}{x - d_0} & \text{for } x \neq d_0, \\[1ex] F'(d_0) & \text{for } x = d_0. \end{cases}
\]
Also, on the assumption of our theorem, there is an $n \ge 0$ such that
\[
F(d_0) = F'(d_0) = \cdots = F^{(n)}(d_0) = 0, \qquad F^{(n+1)}(d_0) \neq 0.
\]
Then $F(x) = H(x)(x - d_0)$, and so
\[
F'(x) = H(x) + H'(x)(x - d_0) \quad \text{for } x \neq d_0 \tag{6.4}
\]
and, more generally,
\[
F^{(k)}(x) = kH^{(k-1)}(x) + H^{(k)}(x)(x - d_0) \tag{6.5}
\]
for $k = 1, 2, \ldots$, $x \neq d_0$, $a < x < b$, by induction.
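
The induction step behind (6.5) is not written out in the text. One way to fill it in, as a sketch that assumes (6.5) for a given $k$ and uses only that $H(x) = F(x)/(x - d_0)$ is infinitely differentiable for $x \neq d_0$, is to differentiate both sides once:
\[
\begin{aligned}
F^{(k+1)}(x) &= \frac{d}{dx}\Bigl[kH^{(k-1)}(x) + H^{(k)}(x)(x - d_0)\Bigr] \\
&= kH^{(k)}(x) + H^{(k)}(x) + H^{(k+1)}(x)(x - d_0) \\
&= (k+1)H^{(k)}(x) + H^{(k+1)}(x)(x - d_0),
\end{aligned}
\]
which is (6.5) with $k + 1$ in place of $k$; the case $k = 1$ is (6.4).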
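We now wish to show that, for $x \neq d_0$,
\[
H^{(\lambda)}(x) = \frac{F^{(\lambda+1)}(d_0)}{\lambda+1} + \frac{F^{(\lambda+2)}(d_0)}{1!\,(\lambda+2)}(x - d_0) + \cdots + \frac{F^{(\lambda+m)}(d_0)}{(m-1)!\,(\lambda+m)}(x - d_0)^{m-1} + G_{\lambda,m}(x - d_0)^{m} \tag{6.6}
\]
provided $\lambda \ge 1$, $m \ge 1$, where $G_{\lambda,m}$ is a linear combination with rational coefficients
of values of $F^{(\lambda+m+1)}(x)$ taken at points $x'$ in the interval $(d_0, x)$.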
For $\lambda = 1$, we have the Taylor expansion for $F'(x)$
\[
F'(x) = F'(d_0) + \frac{F''(d_0)}{1!}(x - d_0) + \cdots + \frac{F^{(m+1)}(d_0)}{m!}(x - d_0)^{m} + \frac{F^{(2+m)}(d_0 + \theta_1(x - d_0))}{(m+1)!}(x - d_0)^{m+1}
\]
where $0 \le \theta_1 \le 1$, while the Taylor expansion for $F(x)$ yields
\[
H(x) = F'(d_0) + \frac{F''(d_0)}{2!}(x - d_0) + \cdots + \frac{F^{(1+m)}(d_0)}{(m+1)!}(x - d_0)^{m} + \frac{F^{(2+m)}(d_0 + \theta_0(x - d_0))}{(m+2)!}(x - d_0)^{m+1}, \tag{6.7}
\]
where $0 \le \theta_0 \le 1$. Hence, from (6.4),
\[
H'(x) = \frac{F'(x) - H(x)}{x - d_0} = \frac{F''(d_0)}{2} + \cdots + \frac{F^{(1+m)}(d_0)}{(m-1)!\,(1+m)}(x - d_0)^{m-1} + G_{1,m}(x - d_0)^{m},
\]
where
\[
G_{1,m} = \frac{F^{(2+m)}(d_0 + \theta_1(x - d_0))}{(m+1)!} - \frac{F^{(2+m)}(d_0 + \theta_0(x - d_0))}{(m+2)!},
\]
as required.

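Suppose now that the assertion has been proved up to some $\lambda \ge 1$, for all $m \ge 1$.
In order to prove the corresponding formula for $\lambda + 1$, we write down the appropriate
Taylor expansion for $F^{(\lambda+1)}(x)$, so
\[
F^{(\lambda+1)}(x) = F^{(\lambda+1)}(d_0) + \frac{F^{(\lambda+2)}(d_0)}{1!}(x - d_0) + \cdots + \frac{F^{(\lambda+m+1)}(d_0)}{m!}(x - d_0)^{m} + \frac{F^{(\lambda+m+2)}(d_0 + \theta_{\lambda+1}(x - d_0))}{(m+1)!}(x - d_0)^{m+1}
\]
where $0 \le \theta_{\lambda+1} \le 1$. Then, by (6.5) and (6.6) (with $m + 1$ for $m$)
\[
H^{(\lambda+1)}(x) = \frac{F^{(\lambda+1)}(x) - (\lambda+1)H^{(\lambda)}(x)}{x - d_0} = \frac{F^{(\lambda+2)}(d_0)}{\lambda+2} + \cdots + \frac{F^{(\lambda+m+1)}(d_0)}{(m-1)!\,(\lambda+m+1)}(x - d_0)^{m-1} + G_{\lambda+1,m}(x - d_0)^{m},
\]
where
\[
G_{\lambda+1,m} = \frac{F^{(\lambda+m+2)}(d_0 + \theta_{\lambda+1}(x - d_0))}{(m+1)!} - (\lambda+1)\,G_{\lambda,m+1}.
\]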
This proves (6.6). We now obtain immediately, for $\lambda \ge 1$,
\[
\lim_{x \to d_0} H^{(\lambda)}(x) = \frac{F^{(\lambda+1)}(d_0)}{\lambda+1} \tag{6.8}
\]
and this is true also for $\lambda = 0$, by (6.7). On the other hand, we may calculate the
derivatives of $H(x)$ at $d_0$. We have, from (6.7), which is valid also for $m = 0$,
\[
H'(d_0) = \lim_{x \to d_0} \frac{H(x) - H(d_0)}{x - d_0} = \lim_{x \to d_0} \frac{H(x) - F'(d_0)}{x - d_0} = \lim_{x \to d_0} \frac{F''(d_0 + \theta_0(x - d_0))}{2} = \frac{F''(d_0)}{2}
\]
where $\theta_0$ may depend on $x$. Thus, $H(x)$ has a continuous derivative everywhere in its
interval of definition.

Suppose now that we have proved that $H(x)$ has continuous derivatives up to
order $\lambda \ge 1$ in the entire interval of definition $a < x < b$ such that
$H^{(\lambda)}(d_0) = F^{(\lambda+1)}(d_0)/(\lambda+1)$. We then make use of (6.6) for $m = 2$, where we observe that
$G_{\lambda,2}$ (as a linear combination with fixed rational coefficients of values of $F^{(\lambda+3)}$ for
arguments $x'$ in the interval $\langle d_0, x \rangle$) remains bounded in the neighborhood of $d_0$.
Hence, for such $x$,
\[
H^{(\lambda)}(x) = \frac{F^{(\lambda+1)}(d_0)}{\lambda+1} + \frac{F^{(\lambda+2)}(d_0)}{\lambda+2}(x - d_0) + O\bigl((x - d_0)^{2}\bigr)
\]
and so
\[
\lim_{x \to d_0} \frac{H^{(\lambda)}(x) - H^{(\lambda)}(d_0)}{x - d_0} = \frac{F^{(\lambda+2)}(d_0)}{\lambda+2} = \lim_{x \to d_0} H^{(\lambda+1)}(x).
\]
This shows that $H(x)$ possesses continuous derivatives of all orders in its interval
of definition. In particular
\[
H^{(j)}(d_0) = F^{(j+1)}(d_0)/(j+1), \qquad j = 0, 1, 2, \ldots \tag{6.9}
\]
and so
\[
H(d_0) = H'(d_0) = \cdots = H^{(n-1)}(d_0) = 0, \qquad H^{(n)}(d_0) = \frac{F^{(n+1)}(d_0)}{n+1} \neq 0.
\]
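If $n > 0$, we may repeat our procedure, obtaining from $H(x)$ a function $H_1(x)$ in the
same way in which we obtained $H(x)$ from $F(x)$. Thus, putting
\[
H_1(x) = \begin{cases} H(x)/(x - d_0) = F(x)/(x - d_0)^{2} & \text{for } x \neq d_0, \\ H'(d_0) & \text{for } x = d_0, \end{cases}
\]
we find that $H_1(x)$ is infinitely differentiable for $a < x < b$ and
\[
H_1(d_0) = H_1'(d_0) = \cdots = H_1^{(n-2)}(d_0) = 0, \qquad H_1^{(n-1)}(d_0) = \frac{F^{(n+1)}(d_0)}{n(n+1)} \neq 0.
\]
Continuing in this way, we obtain after $n - 1$ more steps the function
\[
H_n(x) = \begin{cases} \dfrac{H_{n-1}(x)}{x - d_0} = \dfrac{F(x)}{(x - d_0)^{n+1}} = \dfrac{f(x) - f(d_0)}{(x - d_0)^{n+1}} & \text{for } x \neq d_0, \\[1.5ex] \dfrac{F^{(n+1)}(d_0)}{(n+1)!} \neq 0 & \text{for } x = d_0, \end{cases}
\]
which is infinitely differentiable for $a < x < b$.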
Suppose first that $n$ is even, $n + 1$ is odd. Then the function $w^{1/(n+1)}$, with
the determination that $(H_n(d_0))^{1/(n+1)}$ be real, is infinitely differentiable in the
neighborhood of $H_n(d_0)$ and so the function $P(x) = (H_n(x))^{1/(n+1)}$ is infinitely
differentiable in some neighborhood of $x = d_0$, for $d_0 - h < x < d_0 + h$, say. The
function
\[
Q(x) = P(x)(x - d_0) = (f(x) - f(d_0))^{1/(n+1)}
\]
therefore is also infinitely differentiable in the same interval, and
\[
Q'(x) = P(x) + P'(x)(x - d_0), \qquad Q'(d_0) = P(d_0) \neq 0.
\]
Passing to ${}^{*}R$ we see that, for $x = \xi'$,
\[
{}^{*}Q(\xi') = ({}^{*}f(\xi') - f(d_0))^{1/(n+1)} = (\eta' - e_0)^{1/(n+1)}.
\]
Hence
\[
{}^{\rho}Q(\xi) = w\bigl((\eta' - e_0)^{1/(n+1)}\bigr) = (\eta - e_0)^{1/(n+1)} \in {}^{\rho}L
\]
since $L$, and hence ${}^{\rho}L$, is real closed (see section 3 above). Hence, applying the
inversion theorem to $Q(x)$ at $x = d_0$ (exactly as we applied it earlier to $f(x)$ on the
assumption that $f'(d_0) \neq 0$), and letting $S(y)$ be the inverse function to $Q(x)$ at
$x = d_0$, $y = 0$, we obtain
\[
\xi = w\xi' = w\bigl({}^{*}S((\eta' - e_0)^{1/(n+1)})\bigr) = {}^{\rho}S\bigl((\eta - e_0)^{1/(n+1)}\bigr) \in {}^{\rho}L.
\]
This disposes of the case that $n$ is even.

Suppose finally that $n$ is odd, $n + 1$ is even. We may assume without loss of
generality that $H_n(d_0) = f^{(n+1)}(d_0)/(n+1)!$ is positive, otherwise we consider $-f(x)$
in place of $f(x)$. Then $f(x) - f(d_0) = H_n(x)(x - d_0)^{n+1}$ must be positive, for $x \neq d_0$
in a sufficiently small neighborhood of $d_0$. Introducing $P(x) = (H_n(x))^{1/(n+1)}$ with the
positive determination for $(H_n(d_0))^{1/(n+1)}$, and $Q(x) = P(x)(x - d_0)$ we then have
again that $P(x)$ and $Q(x)$ are infinitely differentiable in a neighborhood of $x = d_0$,
and that $Q'(d_0) = P(d_0) \neq 0$. Also,
\[
{}^{*}Q(\xi') = \pm({}^{*}f(\xi') - f(d_0))^{1/(n+1)} = \pm(\eta' - e_0)^{1/(n+1)}
\]
leading to ${}^{\rho}Q(\xi) = \pm(\eta - e_0)^{1/(n+1)}$, which is again an element of ${}^{\rho}L$. Finally,
introducing the inverse function $S(y)$ of $Q(x)$ with $S(0) = d_0$, as before, we have
\[
\xi = w\xi' = w\bigl({}^{*}S(\pm(\eta' - e_0)^{1/(n+1)})\bigr) = {}^{\rho}S\bigl(\pm(\eta - e_0)^{1/(n+1)}\bigr) \in {}^{\rho}L.
\]
The proof of Theorem 6.3 is now complete.

Although the counterexample given at the beginning of the section shows that 
some restriction on the behavior of the derivatives of f(x) is required, the particular 
set of conditions given in 6.3, is not strictly necessary. Thus, if f(x) =const., then 
the conditions of the theorem are not satisfied but its conclusion is, trivially. Never- 
theless, 6.3 includes a large number of interesting cases, e.g., all non-constant real 
analytic functions f(x). 




7. The mean value theorem. Suppose the function $f(x)$ is continuously
differentiable for $a < x < b$. Let $D$ be the set of points $\xi \in {}^{\rho}R$ such that $\xi$ is infinitely
close to a point $a_0$ in the interior of that interval, $a < a_0 < b$. Then $f'(x)$ is bounded
in any closed subinterval $a' \le x \le b'$ of $a < x < b$ and so $f(x)$ satisfies a Lipschitz
condition in that interval. Taking $a' < a_0 < b'$ we see, therefore, that the definition
5.1 is effective. We claim, moreover, that the resulting function ${}^{\rho}f(x)$ is continuous,
in the sense of the metric of ${}^{\rho}R$, at all points $\xi \in D$.

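To see this, let $\{\xi_n\}$ be a sequence of elements of $D$ such that $\lim_{n\to\infty} \xi_n = \xi$ and
choose a number $\xi'$ and a sequence $\{\xi_n'\}$ in ${}^{*}R$ such that $w\xi' = \xi$, $w\xi_n' = \xi_n$, $n = 0,
1, 2, \ldots$. Since $\lim \xi_n = \xi$, there exist $a', b' \in R$ such that $a' < \xi < b'$, $a' < \xi' < b'$,
$a' < \xi_n < b'$, $a' < \xi_n' < b'$, $n = 0, 1, 2, \ldots$. Let $m$ be a bound for $f'(x)$ in the closed
interval $a' \le x \le b'$ within $R$ and, hence, within ${}^{*}R$. Then
\[
{}^{*}f(\xi_n') - {}^{*}f(\xi') = {}^{*}f'\bigl(\xi' + \theta(\xi_n' - \xi')\bigr)(\xi_n' - \xi')
\]
for some $0 < \theta < 1$ and, hence,
\[
|{}^{*}f(\xi_n') - {}^{*}f(\xi')| \le m\,|\xi_n' - \xi'|.
\]
This, together with $\lim \xi_n = \xi$, implies $\lim {}^{\rho}f(\xi_n) = {}^{\rho}f(\xi)$, proving our assertion.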

Suppose next that $f(x)$ is twice continuously differentiable for $a < x < b$. In this
case, we propose to show that ${}^{\rho}f(x)$ is differentiable in $D$ (in the sense of the metric
of ${}^{\rho}R$) and that on $D$,
\[
\frac{d}{dx}\,{}^{\rho}f(x) = {}^{\rho}(f'(x)). \tag{7.1}
\]
For $\xi$ in $D$ and $\eta \neq 0$ such that $\xi + \eta$ also belongs to $D$, choose $\xi'$ and $\eta'$ for which
$w\xi' = \xi$, $w\eta' = \eta$. Then there exists a $\theta' \in {}^{*}R$, $0 \le \theta' \le 1$, such that
\[
\frac{{}^{*}f(\xi' + \eta') - {}^{*}f(\xi')}{\eta'} = {}^{*}f'(\xi' + \theta'\eta'). \tag{7.2}
\]
Applying the mapping $w$ to (7.2), we obtain
\[
\frac{{}^{\rho}f(\xi + \eta) - {}^{\rho}f(\xi)}{\eta} = \bigl({}^{\rho}f'(x)\bigr)_{x = \xi + \theta\eta} \tag{7.3}
\]
where $\theta = w\theta'$. Now let $\eta$ tend to zero. Then the right hand side of (7.3) tends to
$({}^{\rho}f'(x))_{x=\xi}$ since ${}^{\rho}f'(x)$ is continuous on $D$. This proves (7.1).

In particular, if $f(x)$ is infinitely differentiable, then ${}^{\rho}f(\xi)$ and $({}^{\rho}f'(x))_{x=\xi}$ belong
to ${}^{\rho}L$ for $\xi \in {}^{\rho}L$. It follows that in that case ${}^{\rho}f(x)$ is defined and infinitely differentiable
in $D \cap {}^{\rho}L$. Accordingly, the same is true of the function “f(x) for
$x = a_0 + a_1t^{\nu_1} + a_2t^{\nu_2} + \cdots$, $0 < \nu_1 < \nu_2 < \cdots \to \infty$, $a < a_0 < b$.


(7.3), in combination with (7.1), shows also that the mean value theorem holds in
${}^{\rho}R$ under the stated conditions, more particularly for infinitely differentiable $f(x)$.
However, here again we may show that the mean value theorem breaks down, for
certain infinitely differentiable functions, both in ${}^{\rho}L$ and in $L$. The function (6.1),
which provided an example for the breakdown of the intermediate value theorem,
will do also for the present issue, as can be seen by considering the ratio of increments
$({}^{\rho}f(\xi_2) - {}^{\rho}f(\xi_1))/(\xi_2 - \xi_1)$ for $\xi_1 = 1/(2 + \rho)$, $\xi_2 = -\tfrac12$. There is no $\xi_3 \in {}^{\rho}L$ in the
closed interval from $\xi_1$ to $\xi_2$ such that $({}^{\rho}f(x))'$ is equal to that ratio for $x = \xi_3$.
We shall prove, as our principal positive result in this area:


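7.4. THEOREM. Let $f(x)$ be a real valued function which is defined and infinitely
differentiable for $a < x < b$; $a, b \in R$, and let ${}^{\rho}f(x)$ be defined by 5.1. Suppose
that for every $x$, $a < x < b$, there is an integer $n \ge 2$ such that $f^{(n)}(x) \neq 0$. For any
$x_1, x_2 \in {}^{\rho}L$, $a < x_1 < x_2 < b$, let $a_0$ and $b_0$ be the uniquely defined elements of $R$
which are infinitely close to $x_1$ and $x_2$ respectively, i.e.,
\[
\begin{aligned}
x_1 &= a_0 + a_1\rho^{\nu_1} + a_2\rho^{\nu_2} + \cdots, & 0 < \nu_1 < \nu_2 < \cdots \to \infty, \\
x_2 &= b_0 + b_1\rho^{\mu_1} + b_2\rho^{\mu_2} + \cdots, & 0 < \mu_1 < \mu_2 < \cdots \to \infty,
\end{aligned}
\]
and suppose that $a < a_0 \le b_0 < b$.
Then there exists a $\xi \in {}^{\rho}L$, $x_1 \le \xi \le x_2$, such that
\[
\frac{{}^{\rho}f(x_2) - {}^{\rho}f(x_1)}{x_2 - x_1} = \Bigl(\frac{d}{dx}\,{}^{\rho}f(x)\Bigr)_{x=\xi}.
\]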


Here again there is an exactly corresponding theorem for the function “f(x) in L. 
The conditions of the theorem are not necessary since they exclude all functions of 
constant slope, for which the conclusion of the theorem is obviously correct. However, 
the theorem is nevertheless of a rather general character, including, for example, all 
other real analytic functions. 

For the proof, we require the following auxiliary consideration. 

Assume that the conditions of (7.4) are satisfied and choose $x_1' \in w^{-1}x_1$, $x_2' \in w^{-1}x_2$.
Then we claim that ${}^{*}f'(x)$ attains its maximum in the interval $x_1' \le x \le x_2'$ either
at $x_1'$ or at $x_2'$ or at some standard point $x_0$, $x_1' \le x_0 \le x_2'$ (but, possibly, also
elsewhere).

Suppose that ${}^{*}f'(x)$ attains its maximum neither at $x_1'$ nor at $x_2'$ but at a point
$\bar{x}$, $x_1' < \bar{x} < x_2'$. Let $x_0$ be the standard part of $\bar{x}$, $x_0 = {}^{\circ}\bar{x}$. Suppose that $x_0 < x_1'$ (so
that $x_0 = a_0$). Depending on whether the first non-vanishing derivative of $f'(x)$ at
$x_0$ is either positive or negative, ${}^{*}f'(x)$ will be either strictly increasing or strictly
decreasing in some interval $x_0 \le x \le x_0 + h$, where $h$ is standard and positive.
Since $\bar{x}$ and $x_1'$ belong to that interval, the latter case would involve ${}^{*}f'(x_1') > {}^{*}f'(\bar{x})$,
contrary to our choice of $\bar{x}$. Accordingly, we have to assume that ${}^{*}f'(x)$ increases
strictly for $x_0 \le x \le x_0 + h$. Now $x_2'$ cannot belong to that interval for then
${}^{*}f'(x_2') > {}^{*}f'(\bar{x})$, which is again impossible. It follows that $\bar{x} < x_0 + h < x_2'$ and
${}^{*}f'(x_0 + h) > {}^{*}f'(\bar{x})$, which is also impossible. We therefore conclude that $x_0 \ge x_1'$
and, by similar reasoning, $x_0 \le x_2'$. The discussion of the variation of ${}^{*}f'(x)$ in the
neighborhood of $x_0$ shows that we must exclude both $x_0 < \bar{x}$ and $x_0 > \bar{x}$ and so we
conclude that $\bar{x} = x_0$.

Thus we have shown that ${}^{*}f'(x)$ attains its maximum at $x_1'$ or at $x_2'$ or at some
standard point $x_1' \le x_0 \le x_2'$ (although several of these cases may occur
simultaneously). Accordingly ${}^{*}f'(x)$ attains its maximum in the interval $x_1' \le x \le x_2'$ in
all cases at a point $\zeta_1'$ such that $w\zeta_1' = \zeta_1 \in {}^{\rho}L$. By a similar argument (or, by
applying the conclusion to $-f(x)$) we find that ${}^{*}f'(x)$ attains its minimum in the same
interval at a point $\zeta_2'$ such that $w\zeta_2' = \zeta_2 \in {}^{\rho}L$. Passing from ${}^{*}R$ to ${}^{\rho}R$, we then
conclude that ${}^{\rho}(f'(x))$ attains its maximum and minimum in $x_1 \le x \le x_2$ at points $\zeta_1$
and $\zeta_2$ which belong to ${}^{\rho}L$.

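By a well-known formula of the Integral Calculus, which can be transferred from
$R$ to ${}^{*}R$, we have
\[
{}^{*}f'(\zeta_2')(x_2' - x_1') \le \int_{x_1'}^{x_2'} {}^{*}f'(x)\,dx \le {}^{*}f'(\zeta_1')(x_2' - x_1'),
\]
i.e.,
\[
{}^{*}f'(\zeta_2')(x_2' - x_1') \le {}^{*}f(x_2') - {}^{*}f(x_1') \le {}^{*}f'(\zeta_1')(x_2' - x_1').
\]
We apply the mapping $w$ to this chain of inequalities and obtain
\[
{}^{\rho}(f'(\zeta_2))(x_2 - x_1) \le {}^{\rho}f(x_2) - {}^{\rho}f(x_1) \le {}^{\rho}(f'(\zeta_1))(x_2 - x_1)
\]
or, equivalently,
\[
{}^{\rho}(f'(\zeta_2)) \le \frac{{}^{\rho}f(x_2) - {}^{\rho}f(x_1)}{x_2 - x_1} \le {}^{\rho}(f'(\zeta_1)).
\]
But this shows that $({}^{\rho}f(x_2) - {}^{\rho}f(x_1))/(x_2 - x_1)$ is intermediate between ${}^{\rho}(f'(\zeta_1))$ and
${}^{\rho}(f'(\zeta_2))$. Hence, by the intermediate value Theorem 6.3, there exists a $\xi \in {}^{\rho}L$ which
belongs to the closed interval with endpoints $\zeta_1$ and $\zeta_2$ and, hence, belongs to
$x_1 \le x \le x_2$, such that
\[
\frac{{}^{\rho}f(x_2) - {}^{\rho}f(x_1)}{x_2 - x_1} = {}^{\rho}(f'(\xi))
\]
and this is the same as
\[
\frac{{}^{\rho}f(x_2) - {}^{\rho}f(x_1)}{x_2 - x_1} = \Bigl(\frac{d}{dx}\,{}^{\rho}f(x)\Bigr)_{x=\xi}
\]
by (7.1). The proof of 7.4 is now complete.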




8. Conclusion. As Laugwitz points out, his method for extending a function
$f(x)$ from $R$ to $L$ applies only in the infinitesimal neighborhood of a point at which
$f(x)$ is infinitely differentiable and hence possesses at least a formal Taylor series.
However, if we consider points in the infinitesimal neighborhood of the endpoints of
the interval of definition $a < x < b$ of $f(x)$, e.g., $x = a + a_1t^{\nu_1} + a_2t^{\nu_2} + \cdots$,
$0 < \nu_1 < \nu_2 < \cdots$, $a_1 > 0$, then we can still define “f at $x$, provided $f$ possesses an
asymptotic expansion at $x = a$ as $x$ tends to $a$ from the right. Similarly, if $f(x)$ is
defined in a semi-infinite interval, for $x > a$ say, we can define “f(x) for positive
infinite $x$ provided $f(x)$ possesses an asymptotic expansion as $x \to +\infty$. In all of
these cases, “f(x) can again be obtained "automatically" as $\Phi^{-1}({}^{\rho}f(\Phi x))$ (see (5.3)
above). However, ${}^{\rho}f(x)$ exists also in many cases where no asymptotic expansion as a
generalized power series is available, e.g., ${}^{\rho}\log x$ exists for positive infinitesimal and
infinite $x$. Conversely, we may investigate the asymptotic expansion of a function
$f(x)$ at a singular point (even when it contains logarithmic terms, as happens
frequently in the theory of ordinary differential equations) by means of the function
${}^{\rho}f(x)$. Going further in the direction of concrete applications, ${}^{\rho}R$ also provides us
with a convenient framework for the discussion of matched asymptotic expansions
for the solution of singular perturbation problems.


This research was supported in part by the National Science Foundation Grant No. GP-18728. 


References 


1. N. Jacobson, Lectures in Abstract Algebra, vol. III, Princeton-Toronto-New-York-London,
1964. 

2. D. Laugwitz, Eine nichtarchimedische Erweiterung angeordneter Körper, Math. Nachr., 37
(1968) 225-236. 

3. T. Levi-Civita, Sugli infiniti ed infinitesimi attuali quali elementi analitici (1892-1893), Opere 
matematiche, vol. 1, Bologna 1954, pp. 1-39. 

4. W. A. J. Luxemburg, What is Nonstandard Analysis, California Institute of Technology, 
1968, to be published. 

5. A. Ostrowski, Untersuchungen zur arithmetischen Theorie der Körper, Math. Z., 39 (1935)
269-404. 

6. A. Robinson, Non-standard Analysis, Studies in Logic and the Foundations of Mathematics, 
Amsterdam, 1966. 

7. B. L. v. der Waerden, Algebra, 5th edition, Berlin-Heidelberg-New York, 1966/1967. 

8. O. Zariski and P. Samuel, Commutative Algebra, vol. 2, Princeton-Toronto-New-York- 
London, 1960. 


Yale University, 
September 1970. 


THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 BY BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 NUMBER 7
AUGUST-SEPTEMBER 1973
CODEN: AMMYAE

CONTENTS
Computer Algebra of Polynomials and Rational Functions . . . G. E. COLLINS
On the Discrete Version of Wirtinger’s Inequality . . . O. SHISHA
Current Trends in Algebra . . . GARRETT BIRKHOFF

MATHEMATICAL NOTES
Existence of Four Concurrent Normals to a Smooth Closed Hypersurface of E^n . . . BERND WEGNER
On a Problem of Besicovitch . . . B. FISHER
Topological Properties of the Row Echelon Form . . . G. P. BARKER
Two-dimensional Complete Monotonicity with Diagonalization . . . C. H. KIMBERLING

RESEARCH PROBLEMS
The Permanent of a Doubly Stochastic Matrix . . . RUSSELL MERRIS

CLASSROOM NOTES
A Generalization of a Theorem of Archimedes . . . WALTER RUDIN
The Chromatic Polynomial of a Complete Bipartite Graph . . . J. R. SWENSON

MATHEMATICAL EDUCATION
Training Secondary Mathematics Teachers in Venezuela . . . D. B. AICHELE
Experiences with Lectures on the History of Mathematics in Utrecht . . . A. F. MONNA
Developing Countries: A Rejoinder . . . M. A. B. DEAKIN

ELEMENTARY PROBLEMS AND SOLUTIONS
ADVANCED PROBLEMS AND SOLUTIONS

(Continued on inside cover)


REVIEWS
NEWS AND NOTICES
MATHEMATICAL ASSOCIATION OF AMERICA
MAA Publishes Guidelines for Evaluating College Mathematics Programs
Disability Income Plan Added to the MAA Group Insurance Program
November Meeting of the Indiana Section
February Meeting of the Louisiana-Mississippi Section
February Meeting of the Northern California Section
March Meeting of the Florida Section
New Sectional Governors of the Association
Announcement of Lester R. Ford Awards
The 1973 William Lowell Putnam Mathematical Competition
Films Produced by the MAA
Calendars of Future Meetings


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p.2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for 
protection against loss. 

Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes 
11 months, Math. Education 10 months. 


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of
Mathematics, Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor;
ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of America,
SUNY at Buffalo, Buffalo, N.Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B.
WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036.


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect
ASSOCIATE EDITORS

JOSHUA BARLAZ        J. G. HARVEY         SEYMOUR SCHUSTER
E. R. BERLEKAMP      ERIC S. LANGFORD     J. ARTHUR SEEBACH, Jr.
JANE W. DI PAOLA     P. D. LAX            E. P. STARKE
ROBERT GILMER        ARTHUR MATTUCK       LYNN A. STEEN
RICHARD GUY          M. W. POWNALL        JAMES WENDEL
RAOUL HAILPERN       GIAN-CARLO ROTA


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August -September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


COMPUTER ALGEBRA OF POLYNOMIALS AND 
RATIONAL FUNCTIONS 


G. E. COLLINS, University of Wisconsin 


1. Introduction. Computer programs are now available for performing many 
important algebraic processes of interest and utility to pure and applied mathemati- 
cians. Also, many interesting mathematical problems arise in the development and 
analysis of algorithms for use in such programs. As a mathematician-turned-computer- 
scientist who has contributed to realizing the current state of computer algebra 
capabilities and who is intrigued with the mathematical problems whose solutions 
will contribute to further progress, I hope in this brief survey article to impart some 
of my knowledge about this subject and to transmit some of my enthusiasm for it 
to other mathematicians. 

The kind of computer algebra I will discuss is concerned mainly with polynomials 
and rational functions, for the following reasons. First, the allotted space does 
not permit me to be more ambitious. Second, this is the only part of the subject 
in which I can really be authoritative. Third, this is the part of the subject about 
which the most is known and on which other parts most depend. 

I shall be concerned with polynomials and rational functions in several variables,
primarily with rational integer coefficients, but this will also lead to consideration
of rational number coefficients and finite field coefficients; algebraic number
coefficients will receive some mention as an advanced topic of current research.

Operations on polynomials and rational functions which will be considered 
include the ‘‘arithmetic’’ operations of addition, subtraction, multiplication and 
division. I shall show in Section 3, perhaps to the reader’s surprise, that even poly- 
nomial arithmetic is non-trivial and interesting when the objective is to design 
and analyze optimal algorithms. 

Algorithms for the arithmetic operations on rational functions require an efficient
algorithm for polynomial g.c.d. (greatest common divisor) calculation. Research
over the last seven years has revealed that there are numerous versions of the
"Euclidean algorithm" for polynomials which differ dramatically in their efficiency.
Also, within the last four years, "modular" polynomial g.c.d. algorithms have been
devised, which depend on use of the Chinese remainder theorem, and which are
orders of magnitude faster than any of the non-modular Euclidean algorithms.


George Collins earned his Ph.D. at Cornell University, under J. Barkley Rosser. He has served 
at the IBM Scientific Computing Center, New York, IBM Project Vanguard, New York, and Wash- 
ington D. C., Mathematics and Applications Department, White Plains, New York, IBM Research 
Center, Yorktown Heights, Cal. Tech., and the University of Wisconsin, where he has recently been 
the chairman of the Computer Sciences Department. He has just completed a leave of absence as a 
visiting professor at Stanford University. He has published extensively in Computer Science and is 
preparing a two volume book called Algebraic Algorithms. Editor. 




Furthermore, an intimate relationship has been established between polynomial
g.c.d. calculation and polynomial resultant calculation. Hence all of Section 4 is
devoted to this remarkable success story.

Section 5 is devoted to algorithms for polynomial factorization. This is a more 
difficult problem than polynomial g.c.d. calculation, but the progress of the last 
five years has been equally remarkable. Here again ‘“‘modular’’ algorithms have 
been developed, which reduce the problem of factoring a polynomial with integer 
coefficients to one of factoring a polynomial with finite field coefficients. This time 
however, Hensel’s p-adic lemma takes the place of the Chinese remainder theorem, 
and there is no classical counterpart of the Euclidean algorithm. Instead, reliance 
must be placed on some important new algorithms due to Berlekamp for factoring 
polynomials in one variable over a finite field. Polynomial g.c.d. calculation also 
plays an important role. 

Section 6 covers a variety of subjects worthy of mention including polynomials 
with Gaussian integer coefficients, operations on rational functions, including inte- 
gration, linear algebra over polynomials and rational functions, computing zeros 
of polynomials using exact arithmetic and algebra, and calculations with algebraic 
numbers. 

Throughout, we make provisions for polynomials with arbitrarily large coef- 
ficients. This is natural since computers are now fast enough that any restrictions 
imposed by the word length of the computer are artificial and unnecessary. It is also 
important since only the most trivial algebraic operations on polynomials can be 
performed without generating integer coefficients which are 100 or more decimal 
digits in length. Frequently the final result will have coefficients of modest size, but 
obtaining this result requires the generation of polynomials with very much larger 
coefficients. Thus, Section 2 is devoted to algorithms and computer techniques for 
arithmetic operations and g.c.d. calculations on ‘“‘infinite-precision’’ integers. 


Section 2 also provides an introduction, for the non-computer scientist, to the 
subject of the analysis of algorithms, especially the analysis of the computing time 
of an algorithm at a level which is independent of any particular computer that 
might be used to perform the algorithm. There is a growing conviction among 
many computer scientists that the analysis of algorithms, or the study of ‘‘compu- 
tational complexity,’’ is the most important and fundamental part of computer 
science —if not the only part. Although I share in the enthusiasm of this viewpoint, 
I believe that it overlooks the equally important role of the computer scientist in 
discovering, designing and synthesizing new algorithms. 


I hope that readers of this article will find it interesting or useful in at least one
of the following ways. First, it may create an awareness of the computer programs
and systems for computer algebra which are now available and which might be
useful as tools in conducting some mathematical research of either a pure or applied
nature. Second, the reader may be interested in the mathematical foundations of
the algorithms which are discussed. Third, the reader may be interested in the
mathematical methods and problems which arise in attempting to analyze the computing
times of these algorithms.


2. Infinite-precision integer arithmetic. One cannot go very far in performing
operations on polynomials with rational integer coefficients (hereafter called integral
polynomials) without quickly generating polynomials with very large coefficients.
Most current computers can directly perform arithmetic on integers no larger than
about 10 decimal digits, whereas integers up to several hundred decimal digits are
common in various algebraic calculations. The obvious solution is to teach the
computer (via subprograms, i.e., auxiliary programs used by other programs) to
perform base $\beta$ arithmetic by the well-known "classical" methods, where $\beta$ is a
"parameter" chosen to suit a particular computer. Thus $\beta$ will typically be about
$10^{10}$, but if the computer is itself constructed to perform binary (i.e., base 2)
arithmetic, as most are, then there will usually be some advantage in choosing a power
of 2 for $\beta$. Each "digit" in the base $\beta$ representation of a large integer is then stored
in a separate memory location or "word" of the computer.
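
The base $\beta$ scheme described above can be illustrated with a minimal Python sketch. It is not taken from any of the systems discussed in this article: the function names `to_digits`, `from_digits` and `add_digits` are hypothetical, and storing the least significant digit first is an assumption made only to keep the carry loop simple.

```python
def to_digits(n, beta):
    """Return the base-beta digits of a non-negative integer n,
    least significant digit first (an assumption of this sketch)."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, r = divmod(n, beta)   # peel off one base-beta digit
        digits.append(r)
    return digits


def from_digits(digits, beta):
    """Inverse of to_digits: rebuild the integer from its digits."""
    value = 0
    for d in reversed(digits):   # most significant digit first
        value = value * beta + d
    return value


def add_digits(a, b, beta):
    """Classical (schoolbook) addition of two digit lists with carry."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        s = carry
        if i < len(a):
            s += a[i]
        if i < len(b):
            s += b[i]
        carry, digit = divmod(s, beta)
        result.append(digit)
    if carry:
        result.append(carry)
    return result
```

With `beta = 10` the digit lists are the familiar decimal digits; with a power of 2 such as `beta = 2**30` each digit would fit in one machine word on many binary computers.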

The only real problem which arises in this scheme is one of computer memory
allocation or, as we also say, storage allocation. Such memory allocation must
be done "dynamically," i.e., when the computer program is executed rather than
when the program is written. This is because a program variable may take on a
sequence of integers of very different lengths during the course of running the program
just once, and it would be very wasteful to allow enough memory for the largest
integer which might occur, even if this were possible to predict. The problem is
further compounded if we are working with a polynomial some of whose coefficients
may be much larger than others (and many of the coefficients may even be zero).

Several satisfactory solutions have been devised for the dynamic storage
allocation problem, but the first discovered, the most elegant, and the most universally
applicable solution is that of lists. Suppose, for illustrative purposes, that $\beta = 10$
and that we wish to store the number 352. We can store the three digits of this number
in any three available words of the computer memory, provided we also store in
the first word the location, or address, of the second word, in the second the address
of the third, and in the third an indication, say some standard fictitious address, that
there are no further words in this list of words. Since the choice of the three words
is insignificant, we represent this diagrammatically as follows:


    [ 3 | *-]--> [ 5 | *-]--> [ 2 | nil ]


We say that these three words comprise a representation of the list (3, 5, 2) and that
the location of this list is the location of the first word. Each word in the list contains
two fields: an element field (left half) and a successor field (right half). We say that
the list (3, 5, 2) represents the integer 352.




At any given time, all of the words in some designated portion of the computer 
memory which are not in use as part of some such data list are themselves all linked 
together, in arbitrary order, as an available space list. When a new word is needed 
to construct some data list, the first word is unlinked from the available space list 
and linked to the data list. When a data list is no longer needed to complete a
computation, its words or cells are linked to the head of the available space list.
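
The bookkeeping just described can be sketched in a few lines. This is only an illustrative model, not the layout of any actual system: the class name `ListMemory`, its method names, and the use of -1 as the "fictitious address" are assumptions made for the example, and two parallel Python lists stand in for the element and successor fields of each word.

```python
NIL = -1  # stands in for the "standard fictitious address" that ends a list


class ListMemory:
    """A block of two-field words managed with an available space list."""

    def __init__(self, size):
        # Assumes size >= 1.
        self.element = [0] * size
        # Initially every word is on the available space list: 0 -> 1 -> ... -> NIL.
        self.successor = list(range(1, size)) + [NIL]
        self.avail = 0

    def cons(self, value, tail):
        """Unlink the first available word and link it in front of `tail`."""
        if self.avail == NIL:
            raise MemoryError("available space list is empty")
        word = self.avail
        self.avail = self.successor[word]
        self.element[word] = value
        self.successor[word] = tail
        return word

    def build(self, digits):
        """Store a digit sequence as a list; return the location of its first word."""
        head = NIL
        for d in reversed(digits):
            head = self.cons(d, head)
        return head

    def read(self, head):
        """Collect the element fields of a list, following successor fields."""
        out = []
        while head != NIL:
            out.append(self.element[head])
            head = self.successor[head]
        return out

    def erase(self, head):
        """Link every word of a list back onto the head of the available space list."""
        while head != NIL:
            following = self.successor[head]
            self.successor[head] = self.avail
            self.avail = head
            head = following
```

With `mem = ListMemory(4)`, the call `mem.build([3, 5, 2])` stores the list (3, 5, 2) in three of the four words, and `mem.erase` makes those words available again for a later list.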


These basic concepts of list processing were first set forth in 1957 in a paper, 
[51], by A. Newell, J. C. Shaw and H. A. Simon. John McCarthy in 1960, [46], 
devised an important programming system and language, LISP, for list processing 
which permits overlapping of lists (a single cell can occur in several data lists) and 
which automatically reclaims most data lists which are no longer needed (the recla- 
mation is called garbage collection). Collins in 1960, [10], devised an alternative 
scheme for overlapping using reference counts. A computer program system for 
infinite-precision integer arithmetic using list processing was first described by Collins 
in 1966, [11], although the system existed as early as 1961. We shall discuss list 
processing further in connection with polynomials. An in-depth treatment of list 
processing principles is to be found in the book [39] by D. E. Knuth. 


Classical methods for base $\beta$ arithmetic, as taught in the elementary schools
for $\beta = 10$, are sufficiently well specified and efficient for computer algorithms,
except for division. In division the method specified for generating the successive
digits of the quotient involves some human judgement, which must be eliminated
from a computer algorithm. In 1960, D. A. Pope and M. L. Stein, [54], proposed
the following algorithm. We first normalize the divisor, multiplying both it and the
dividend by a positive integer such that the leading digit of the new divisor is at least
$[\beta/2]$, the integral part of $\beta/2$. For this purpose we can use $[\beta/(b_n + 1)]$ as multiplier,
where $b_n$ is the leading digit of the original divisor. Now let $B = \sum_{i=0}^{n} b_i \beta^i$
be the normalized divisor, $Q = \sum_{i=0}^{m} q_i \beta^i$ the desired quotient. Assume $q_m, \ldots, q_{j+1}$
have already been determined and let $A = \sum_{i=0}^{n+j+1} a_i \beta^i$ be the current remainder.
Then $q_j$ is approximated by $\hat{q}_j = [(a_{n+j+1}\beta + a_{n+j})/b_n]$ unless $a_{n+j+1} = b_n$, in
which case $\hat{q}_j = \beta - 1$. Pope and Stein showed that in all cases $0 \le \hat{q}_j - q_j \le 2$.
Thus at most two corrections are required to obtain $q_j$ and the necessity of a correction
is always characterized by a negative remainder. A 1969 analysis by Collins
and D. R. Musser [25], shows that, except for very small $\beta$, $\hat{q}_j - q_j$ has the values
0, 1 and 2 with probabilities of about .674, .318 and .008, respectively. D. E. Knuth,
1969 ([40], Section 4.3.1) has shown how to make a simple correction to $\hat{q}_j$, depending
also on $a_{n+j-1}$ and $b_{n-1}$, obtaining $q_j^*$ such that $0 \le q_j^* - q_j \le 1$ and
$q_j^* - q_j = 1$ with probability not exceeding $3/\beta$.
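
The trial-digit rule and the bound on its error can be checked by brute force for a small base. The sketch below is an illustration under stated assumptions rather than the published algorithm: the function names are hypothetical, only a single quotient digit is produced, and the remainder A is restricted to $0 \le A < \beta B$ so that the true digit is less than $\beta$.

```python
def trial_digit(a_top, a_next, b_top, beta):
    """Estimate a quotient digit from the two leading remainder digits."""
    if a_top == b_top:
        return beta - 1
    return (a_top * beta + a_next) // b_top


def digits(x, beta, width):
    """Base-beta digits of x, least significant first, padded to `width`."""
    out = []
    for _ in range(width):
        out.append(x % beta)
        x //= beta
    return out


def overestimate_range(beta, n_digits):
    """Smallest and largest value of (estimate - true digit) over all
    normalized divisors B with n_digits digits and all A with 0 <= A < beta*B."""
    smallest, largest = 0, 0
    first = (beta // 2) * beta ** (n_digits - 1)   # leading digit >= [beta/2]
    for B in range(first, beta ** n_digits):
        b = digits(B, beta, n_digits)
        for A in range(beta * B):
            a = digits(A, beta, n_digits + 1)
            diff = trial_digit(a[-1], a[-2], b[-1], beta) - A // B
            smallest, largest = min(smallest, diff), max(largest, diff)
    return smallest, largest
```

For $\beta = 10$ and two-digit divisors the estimate is never too small and is at most 2 too large; for instance B = 59 and A = 400 give an estimate of 8 against a true digit of 6.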

Any algorithm $\mathcal{A}$ has a certain well-defined set $I_{\mathcal{A}}$ of valid inputs (the elements
of $I_{\mathcal{A}}$ may be n-tuples). If the algorithm is initiated with a valid input x, the
algorithm performs a certain finite number, $t_{\mathcal{A}}(x)$, of primitive actions (which may
be thought of as the instructions of a real computer) and finally stops, producing
some output (which may be an m-tuple). $t_{\mathcal{A}}$ is the computing time function of the
algorithm $\mathcal{A}$.

The computing time function of an algorithm obviously depends on the specific 
definition of a primitive action (i.e., on the particular computer on which the al- 
gorithm is implemented) and on various other uninteresting and inessential aspects 
of the algorithm, so we are interested only in the codominance equivalence class 
of the function in the following sense. 

If f and g are two real-valued functions defined on a set S we say that f is dominated
by g, or g dominates f, and write $f \preceq g$ in case there is a positive real number
c such that $f(x) \le c\,g(x)$ for all $x \in S$. If $f \preceq g$ and $g \preceq f$, we say that f and g are
codominant, and write $f \sim g$. If $f \preceq g$ but not $g \preceq f$ we say that f is strictly
dominated by g and write $f \prec g$.

The $\beta$-length of a non-zero integer a, $L_\beta(a)$, is defined as the number of digits
in its base $\beta$ representation; the $\beta$-length of zero is defined to be 1. It is easy to verify
that if $\gamma$ is any other base, then $L_\beta \sim L_\gamma$ so we shall often just write L(a) or refer
to the length of a.

Now if M is the classical algorithm for multiplication of two infinite-precision
integers, it is easy to verify, and intuitively obvious, that $t_M(a, b) \sim L(a) \cdot L(b)$ for $a \ne 0$
and $b \ne 0$. In particular, this means we can multiply any two n-digit numbers in
at most $n^2$ units of time. It was not until 1962 that a faster integer multiplication
algorithm was discovered, when A. Karatsuba proposed an algorithm with computing
time dominated by $n^{\log_2 3}$ ($\log_2 3 = 1.58\ldots$). Karatsuba's algorithm is the
first in an infinite sequence of algorithms $M_1, M_2, \ldots$, the computing time of $M_k$
being dominated by $n^{e_k}$, where $e_k = \log_{k+1}(2k + 1)$. Thus for every $\epsilon > 0$ there
is a multiplication algorithm with computing time dominated by $n^{1+\epsilon}$. Still
more recently, multiplication algorithms have been discovered by S. A. Cook and by
A. Schönhage which are faster than any of the $M_k$. For an excellent description
and analysis of these integer multiplication algorithms, see [40], Section 4.3.
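
The idea behind Karatsuba's algorithm is that a product of two numbers split into halves needs only three half-size products instead of four. The following sketch shows the recursion on Python integers split by bits; the function name, the bit-based splitting and the `cutoff_bits` threshold are choices made for this illustration and are not taken from the references.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply non-negative integers using three recursive half-size products."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y                       # small operands: ordinary product
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask       # x = x_hi * 2**half + x_lo
    y_hi, y_lo = y >> half, y & mask       # y = y_hi * 2**half + y_lo
    low = karatsuba(x_lo, y_lo, cutoff_bits)
    high = karatsuba(x_hi, y_hi, cutoff_bits)
    # (x_lo + x_hi)(y_lo + y_hi) - low - high equals x_lo*y_hi + x_hi*y_lo
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - low - high
    return (high << (2 * half)) + (cross << half) + low
```

Since each level replaces one product of length n by three of length about n/2, the number of elementary products grows like $3^{\log_2 n} = n^{\log_2 3}$.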

Unfortunately, these "fast" multiplication algorithms are faster only for larger
integers than normally arise in most algebraic calculations. James R. Pinkert found
[52], that $M_1$ became faster than M only when the inputs had $\beta$-lengths greater
than 55 (about 550 decimal digits). Subsequent discussions of computing times in
this article will therefore be based on the use of classical algorithms for integer 
arithmetic. 

Of course, the time to add or subtract two integers, a and b, is


$\preceq \max(L(a), L(b)) \sim L(a) + L(b)$.

The time to divide a by b using the classical algorithm, obtaining a quotient q and
remainder r, is


$\sim L(b) \cdot L(q) \sim L(b) \cdot (L(a) - L(b) + 1)$ for $|a| \ge |b| > 0$.


The classical algorithm for computing the g.c.d. of two integers is the familiar
Euclidean algorithm. It can be shown, [15], that the time to compute c = gcd(a, b),
$a \ge b > 0$, by the Euclidean algorithm is $\preceq L(b) \cdot L(a/c) \sim L(b) \cdot (L(a) - L(c) + 1)$.

D. H. Lehmer [43], has devised a version of the Euclidean algorithm which 
is faster for large integers. Lehmer’s version has the same computing time function 
as the ordinary Euclidean algorithm, to within codominance, but it is about ten 
times as fast on a typical computer when the inputs are longer than one $\beta$-digit.
Knuth [41], has recently discovered an integer g.c.d. algorithm whose computing
time, for inputs of length n, is dominated by $n^{1+\epsilon}$ for every $\epsilon > 0$. But this algorithm
also appears impractical for integers of the sizes which commonly occur in algebraic 
calculations, and we shall not assume its use. 

Instead of studying the computing time function of an algorithm directly, it is 
often more enlightening to study certain related functions. Consider, for example, 
the Euclidean algorithm E, and let S(m, n, k) be the set of all pairs of integers (a, b)
such that $|a| \ge |b| > 0$, L(a) = m, L(b) = n, and L(c) = k, where c = gcd(a, b).
The sets S(m, n, k) are finite, disjoint, and together contain all valid inputs to E. If


$t_E^+(m, n, k) = \max\{t_E(a, b) : (a, b) \in S(m, n, k)\}$,


$t_E^+$ is the maximum computing time function for E. The result about $t_E$ quoted
above can be restated as $t_E^+(m, n, k) \preceq n(m - k + 1)$. It can, in fact, be shown [15],
that $t_E^+(m, n, k) \sim n(m - k + 1)$. Analogously, we can define a minimum computing
time function $t_E^-$ for E. It is proved in [15] that $t_E^-(m, n, k) \sim n(m - n + 1) + k(n - k + 1)$.
Note that, therefore, $t_E^- \prec t_E^+$. An average computing time function for E is defined
by

$t_E^*(m, n, k) = \sum \{t_E(a, b) : (a, b) \in S(m, n, k)\} / \mathrm{card}(S(m, n, k))$


(card(S) is the cardinality of the set S). It is a much deeper result of [15], proved
with the aid of a recent result of Dixon [27], that $t_E^* \sim t_E^+$. Dixon's result pertains
to the average number of divisions in the Euclidean algorithm, a subject which is 
treated at length by Knuth in [40], Section 4.5.3. 

Among the various program systems which are now available for computer 
algebra, only a few provide infinite-precision integer arithmetic. Among these, 
one of the most readily available is SAC-1 since it is programmed, with the exception 
of a few simple "primitive" subprograms, entirely in the standardized Fortran IV
language [1]. SAC-1 is organized in modular form, with different modules or
"subsystems" providing different capabilities. There are currently ten modules, of
which the second [17], is for integer arithmetic while the first [16], provides a
list-processing capability required by all other modules. The SAC-1 system is currently
in use at some 50 institutions throughout the United States and in several other
countries on computers of half a dozen major manufacturers. Other systems which
provide infinite-precision integer arithmetic are Reduce 2 [31], PL/I-FORMAC
(up to 2295 decimal digits!) [56], and SCRATCHPAD/1 [29]. Of these, only the
first is available on non-IBM computers.




Codominance functions for the computing times are deliberately devoid of any 
constants and must be supplemented for practical purposes with empirically ob- 
served computing times for particular cases. Table 1 gives sample computing times
for integer addition, multiplication, division and g.c.d. calculation. A, B and C
are 100-decimal-digit integers, while D and E are 200 decimal digits long. Times,
in seconds, are for the SAC-1 system on a UNIVAC 1108 computer. More extensive
tables and formulas for empirical computing times are given in [17]. 


TABLE 1. Computing Times for Integer Arithmetic 


3. Polynomial arithmetic. We have seen above how to store in the computer
any list $(a_1, a_2, \ldots, a_n)$ in which each $a_i$ is a "small integer" (i.e., small enough to
be stored in one memory location). We now have a need to consider more general
lists. Let S be a set of small integers (e.g., $S = \{a : |a| < \beta\}$). A list over S is defined
recursively as a sequence $(a_1, \ldots, a_n)$ such that, for $1 \le i \le n$, either $a_i \in S$ or $a_i$
is a list over S. In this definition n = 0 is permitted, so that we have a null list over S,
denoted by "( )". The order of a list over S may also be defined recursively: ord(a) = 0
for $a \in S$, ord(( )) = 1 and, for n > 0, $\mathrm{ord}((a_1, \ldots, a_n)) = \max(\mathrm{ord}(a_1), \ldots, \mathrm{ord}(a_n)) + 1$.
For example, (3, ( ), (2, (3, 1), 1), (( )), 2) is a list of order 3. The diagram for the
second order list (2, (1, 2), 1) is


[ 2 | *-]--->[ * | *-]--->[ 1 | / ]
               |
               v
             [ 1 | *-]--->[ 2 | / ]


As suggested by the vertical arrow in the diagram, the second word in the upper
row contains in its element field the location of the list (1, 2)
shown in the lower row of the diagram. To preclude ambiguity, each word of a list
may contain also a type field to designate whether its element field is a list location 
or an atom (i.e., element of S). 
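
The recursive definition of the order of a list translates directly into a recursive function. In this sketch, which is only an illustration, atoms are modelled as Python integers and lists over S as (possibly nested) tuples; the `isinstance` test plays the part of the type field just mentioned.

```python
def order(a):
    """Order of a list over S; atoms (elements of S) are modelled as ints."""
    if isinstance(a, int):       # an atom
        return 0
    if len(a) == 0:              # the null list ( )
        return 1
    return max(order(item) for item in a) + 1
```

For the two lists used as examples above, `order((3, (), (2, (3, 1), 1), ((),), 2))` is 3 and `order((2, (1, 2), 1))` is 2.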

Now suppose that $\mathcal{R}$ is some commutative ring for which a list representation
has been specified. For each $a \in \mathcal{R}$, let $\bar{a}$ be the list which represents a. Consider
now the specification of a list representation for the polynomial ring $\mathcal{R}[x]$ of
polynomials with coefficients in $\mathcal{R}$. The most obvious possibility is to represent any
polynomial $A(x) = \sum_{i=0}^{n} a_i x^i$ with $a_n \ne 0$ by the list $(\bar{a}_n, \bar{a}_{n-1}, \ldots, \bar{a}_0)$, and to
represent the zero polynomial by the null list ( ). However, when working with
sparse polynomials, i.e., polynomials with many zero coefficients, this representation
is wasteful of storage. Another possibility is then to express A(x) in the form
$\sum_{i=1}^{k} a_i x^{e_i}$, where $e_1 > e_2 > \cdots > e_k$ and each $a_i \ne 0$, and to represent A(x) by
the list $(\bar{a}_1, e_1, \bar{a}_2, e_2, \ldots, \bar{a}_k, e_k)$. Of course, for polynomials with few zero coefficients,
this representation requires more memory than the former, but never more than
twice as much.
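
The trade-off between the two representations is easy to see on a small example. The sketch below assumes integer coefficients and takes its input as a Python dictionary mapping exponents to coefficients; both function names and that input format are conveniences of this illustration.

```python
def dense_list(terms):
    """(a_n, ..., a_0) for a dict {exponent: coefficient}; () for zero."""
    nonzero = {e: a for e, a in terms.items() if a != 0}
    if not nonzero:
        return ()
    n = max(nonzero)
    return tuple(nonzero.get(e, 0) for e in range(n, -1, -1))


def sparse_list(terms):
    """(a_1, e_1, ..., a_k, e_k) with e_1 > ... > e_k and every a_i nonzero."""
    out = []
    for e in sorted(terms, reverse=True):
        if terms[e] != 0:
            out.extend((terms[e], e))
    return tuple(out)
```

For $3x^{100} - 1$ the dense list has 101 entries and the sparse list has 4, whereas for $x^2 + x + 1$ the dense list has 3 entries and the sparse list has 6, which is the factor of two mentioned above.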

Polynomials in several variables are conveniently regarded as polynomials in 
one main variable with coefficients which are polynomials in the remaining variables. 
The result is a recursive representation for $\mathcal{R}[x_1, \ldots, x_r]$. It has the advantage that
recursive algorithms can naturally be used for various operations — algorithms 
which use themselves as subalgorithms, directly or indirectly, to perform the same 
operation on polynomials in fewer variables. For operations such as addition or 
multiplication such a recursive algorithm is merely simpler, but for other operations 
such as polynomial division the only known algorithms depend on the use of a main 
variable, and are thus inherently recursive. 

We denote by I, Q, R and C, respectively, the rings of the integers, the rationals,
the real numbers and the complex numbers. For any polynomial $A(x_1, \ldots, x_r)$ over
C we define two "norms", $|A|_\infty$ and $|A|_1$, by induction on r. For r = 0 ($A \in C$), we
define $|A|_\infty = |A|_1 = |A|$. For r > 0, if $A(x_1, \ldots, x_r) = \sum_{i=0}^{n} A_i(x_1, \ldots, x_{r-1})\, x_r^i$
we define


(1) $|A|_\infty = \max_{0 \le i \le n} |A_i|_\infty$
and
(2) $|A|_1 = \sum_{i=0}^{n} |A_i|_1$.


$|A|_\infty$ and $|A|_1$ are called the max norm and sum norm, respectively. One obtains
easily the relations


(3) $|A + B|_\infty \le |A|_\infty + |B|_\infty$,
(4) $|A + B|_1 \le |A|_1 + |B|_1$,
(5) $|A \cdot B|_\infty \le |A|_\infty \cdot |B|_1$,

(6) $|A \cdot B|_1 \le |A|_1 \cdot |B|_1$,

and

(7) $|A|_\infty \le |A|_1$.
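
Because the two norms are defined by induction on the number of variables, they can be computed by a pair of short recursive functions. In this sketch a polynomial in the recursive representation is modelled as a Python list of coefficients in the main variable (lowest degree first), each coefficient being either a number or another such list; that encoding, the function names, and the univariate `multiply` helper used to try out relations (5) and (6) are assumptions of the example.

```python
def max_norm(A):
    """|A|_infinity for a nested coefficient list (or a plain number)."""
    if not isinstance(A, list):
        return abs(A)
    return max((max_norm(c) for c in A), default=0)


def sum_norm(A):
    """|A|_1 for a nested coefficient list (or a plain number)."""
    if not isinstance(A, list):
        return abs(A)
    return sum(sum_norm(c) for c in A)


def multiply(A, B):
    """Classical product of two non-empty univariate integer coefficient lists."""
    out = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            out[i + j] += a * b
    return out
```

For the bivariate polynomial `[[1, -2], 3, [0, 0, 5]]` the max norm is 5 and the sum norm is 11, and for any two univariate lists `A` and `B` one can confirm that `max_norm(multiply(A, B)) <= max_norm(A) * sum_norm(B)`.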


If $A(x_1, \ldots, x_r)$ is a polynomial in r variables, we denote by $\partial_i(A)$ the degree
of A in $x_i$ and by $\partial(A)$ the degree vector $(\partial_1(A), \ldots, \partial_r(A))$. We shall often write
deg(A) for $\partial_r(A)$, the degree of A in its main variable.

Let $m_i = \partial_i(A)$, $n_i = \partial_i(B)$, $a_0 = |A|_\infty$, $b_0 = |B|_\infty$, $a_1 = |A|_1$, $b_1 = |B|_1$.
Clearly we can design an algorithm to add integral polynomials whose computing
time is dominated by
(8) (L(do) + L(bo)} LT] (mitt. 


A "classical" algorithm for integral polynomial multiplication has computing time
dominated by 


(9) L(ay)L(5,) if (m, + 1)(n; + 1). 


We can also design a "classical" algorithm for integral polynomial division with
computing time dominated by (9) if we now let A be the divisor and B the resulting
quotient. If we attempt to obtain a bound for the computing time of this algorithm
depending only on the inputs, namely the divisor A and the dividend C, then we
encounter the problem of determining an upper bound for $|B|_\infty$ in terms of
$|A|_\infty$, $|C|_\infty$, $\partial(A)$ and $\partial(C)$, where $C = A \cdot B$. This problem appears to be no
easier than the related problem of obtaining an upper bound for the norm of any
divisor of A as a function of $|A|_1$ and $\partial(A)$. This latter problem is one which is
presently far from a satisfactory solution even for the case r = 1, but we will consider
some of the known results.

Define $U(m, n, d) = \max\{|B|_\infty\}$, the maximum being taken over all $A(x), B(x) \in I[x]$
with $B \mid A$, deg(A) = m, deg(B) = n and $|A|_1 = d$, for $m \ge n \ge 1$ and $d \ge 1$. It is easy to see that


(10) U(m, 1, d) = U(m, m, d) = d.


Let $A(x), B(x) \in I[x]$, $B \mid A$, deg(A) = m, deg(B) = n and $|A|_1 = d$. By considering
the factorization of A over the field of complex numbers, it is not difficult to show
that


(11) $|B|_1 \le |b_n| \cdot (\alpha + 1)^n$,


where $b_n = \mathrm{ldcf}(B)$, the leading coefficient of B, and $\alpha = \max_{1 \le i \le m} |\alpha_i|$,
$\alpha_1, \ldots, \alpha_m$ being the zeros of A. We can also show quite easily that


(12) $\alpha \le d - 1$.

Since $|b_n| \le |a_m| \le d$, where $a_m = \mathrm{ldcf}(A)$, we have by (11) and (12) that

(13) $U(m, n, d) \le d^{n+1}$.


A second approach to this problem yields a different bound. If c is any integer,
then $B(c) \mid A(c)$ in I. Hence if $A(c) \ne 0$, then $|B(c)| \le |A(c)|$. Since A has at most
m zeros, we can find n + 1 integers c such that $A(c) \ne 0$ and $|c| \le \lceil (m + n)/2 \rceil$,
the least integer greater than or equal to (m + n)/2, and from the values of B(c),
the coefficients of B can be determined by interpolation. By considering the sum
norms of the Lagrange interpolation polynomials, we obtain




(14) U(m,n,d) S (m+ 1)"*"*"d. 


By elaboration of the interpolation approach, David R. Musser [50], obtains sharper
bounds for $|B|_1$ under the assumption that $A(c) \ne 0$ for $|c| \le [n/2]$, c an integer.

Can either of the bounds (13) and (14) be realized? A partial answer is obtained
for the case d = 2 by taking $A(x) = x^m - 1$ and $B(x) = \Phi_m(x)$, the mth cyclotomic
polynomial. $\Phi_m$ is an irreducible divisor of A of degree $\varphi(m)$, where $\varphi$ is Euler's
function. Paul Erdős showed in 1945 [28], that for some $c_1 > 0$, $|\Phi_m|_\infty > \exp\{c_1 (\log m)^{4/3}\}$
for infinitely many positive integers m. This implies that for every positive integer k
there exist infinitely many pairs (m, n) for which


(15) $U(m, n, 2) > m^k$.


Thus U(m,n,d) is not bounded by any polynomial function of m,n and d, as one 
might otherwise conjecture. 
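
The cyclotomic example can be explored numerically. The sketch below computes $\Phi_m$ by dividing $x^m - 1$ by $\Phi_d$ for every proper divisor d of m, which is one standard way to obtain it; the function names and the coefficient-list encoding (lowest degree first) are assumptions of this illustration, and the lists returned by `cyclotomic` are cached and should not be modified by the caller.

```python
from functools import lru_cache


def exact_divide(num, den):
    """Quotient of integer polynomials (lowest degree first); den must be
    monic and must divide num exactly."""
    num = list(num)
    top = len(den) - 1
    quotient = [0] * (len(num) - top)
    for i in range(len(quotient) - 1, -1, -1):
        quotient[i] = num[i + top]          # den is monic, so no division is needed
        for j, d in enumerate(den):
            num[i + j] -= quotient[i] * d
    assert not any(num), "division was not exact"
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(m):
    """Coefficients of the m-th cyclotomic polynomial, lowest degree first."""
    poly = [-1] + [0] * (m - 1) + [1]       # x^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = exact_divide(poly, cyclotomic(d))
    return poly
```

For small m every coefficient of $\Phi_m$ is 0, 1 or -1; m = 105 is the first index for which a coefficient of absolute value 2 appears, so a divisor of $x^{105} - 1$ already has a larger max norm than the polynomial it divides.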

Now let $A(x_1, \ldots, x_r)$ be a multivariate integral polynomial, with $m_i = \partial_i(A)$.
By induction on r, the interpolation approach to factor coefficient bounding can 
be used to prove that if B | A, then 


(16) |Bl, S I] (mm Aly. 


We shall see in Section 5 that a bound such as (16) is essential for an efficient algo- 
rithm for computing the complete factorization of an integral polynomial. 

Thus far we have considered division in a polynomial ring $\mathcal{R}[x_1, \ldots, x_r]$, with
the implicit assumptions that $\mathcal{R}$ is an integral domain and that the quotient is known
to exist. If the quotient does exist then in an integral domain it is unique. If we drop
the existence assumption then we have the problem of designing a trial division
algorithm which, given A and $B \ne 0$, first decides whether $B \mid A$ and then, if so,
computes C = A/B. By induction on r, we may assume a trial division algorithm
for $\mathcal{R}$ and obtain one for $\mathcal{R}[x]$, obtaining thereby a recursive algorithm for
$\mathcal{R}[x_1, \ldots, x_r]$. If A = 0, then C = 0. Otherwise, $B \mid A$ only if $m \ge n$ and $b \mid a$, where
m = deg(A), n = deg(B), a = ldcf(A) and b = ldcf(B). If $m \ge n$ and c = a/b, then
$c\,x^{m-n}$ will be a term of the quotient, if it exists, and the process is repeated with
$A(x) - c\,x^{m-n} B(x)$ in place of A(x).
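
For univariate polynomials over the integers the trial division process reads as follows. This is a minimal sketch under stated assumptions: polynomials are coefficient lists with the lowest degree first, the empty list is the zero polynomial, and the function names are chosen for the example only.

```python
def strip(P):
    """Drop zero leading coefficients."""
    P = list(P)
    while P and P[-1] == 0:
        P.pop()
    return P


def trial_divide(A, B):
    """Return C with A == B*C, or None if B does not divide A over the integers."""
    A, B = strip(A), strip(B)
    if not B:
        raise ValueError("divisor must be nonzero")
    if not A:
        return []                           # A = 0 gives C = 0
    n, b = len(B) - 1, B[-1]
    C = [0] * max(len(A) - len(B) + 1, 0)
    while A:
        m, a = len(A) - 1, A[-1]
        if m < n or a % b != 0:             # degree too small, or b does not divide a
            return None
        c = a // b
        C[m - n] = c                        # c * x^(m-n) is a term of the quotient
        for j, coeff in enumerate(B):       # replace A by A - c * x^(m-n) * B
            A[m - n + j] -= c * coeff
        A = strip(A)
    return C
```

Dividing $2x^2 + x - 3$ by $x - 1$ returns the coefficients of $2x + 3$, while dividing $x^2 + 1$ by $x + 1$ returns `None` after two steps.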

In the case $\mathcal{R} = I$ there does indeed exist an obvious trial division algorithm,
and so we obtain one for $I[x_1, \ldots, x_r]$. This algorithm is efficient for cases in which
the quotient exists, and one would expect it to terminate quickly in most cases for 
which the quotient does not exist. However, rigorous analysis of either worst case 
or average behavior appears to be very difficult. After a few iterations of leading 
coefficient division and subtraction, the coefficients of the remainder can become 
very large, with the result that the computing time can be very large. Such extreme 
cases can be easily constructed with ldcf(B) = 1, but other extreme cases are difficult
to find. This is an excellent example of an algorithm whose behaviour is presumably 
satisfactory but not yet well understood. 




In this section we have discussed some aspects of "classical" algorithms for performing
the arithmetic operations in $\mathcal{R}[x_1, \ldots, x_r]$, with emphasis on the case $\mathcal{R} = I$.
Numerous computer program systems now exist with capabilities of this general
nature, but with significant differences of importance to the user. The SAC-1
Polynomial System [18], provides polynomial arithmetic for $\mathcal{R} = I$ together with some
other operations such as differentiation and substitution. The ALTRAN system [6]
and [30], provides similar capabilities, but with limited-precision integer arithmetic.
In ALTRAN, the polynomials may also have exact rational coefficients or approximate
real coefficients. Also, ALTRAN is one of the few current systems which does
not use either the recursive canonical form for polynomials or a list representation.
Other widely distributed systems with comparable capabilities for polynomial
arithmetic are Reduce 2 [31], and PL/I-FORMAC [57].

As an illustration of actual computing times, it requires approximately one second 
to multiply two univariate polynomials of degree 30 with coefficients 10 decimal 
digits long, using SAC-1 on a UNIVAC 1108 computer. The time to multiply two 
bivariate polynomials of degree 6 in each variable with coefficients 30 decimal 
digits long is approximately 5 seconds. The time to divide $A \cdot B$ by A is approximately
the same as the time to multiply A by B.


4. Polynomial greatest common divisors. Computing g.c.d.’s of multivariate 
integral polynomials is very important because, apart from other applications, it 
permits us to perform arithmetic in the fraction field consisting of the rational 
functions over the field Q of rational numbers, while keeping fractions reduced to 
lowest terms. 

There is a classical algorithm which is available for this purpose, namely the 
Euclidean algorithm for univariate polynomials over a field. Given $A(x_1, \ldots, x_r)$,
we can regard A as a polynomial in $x_r$ with coefficients in $Q(x_1, \ldots, x_{r-1})$. Assuming,
inductively, a g.c.d. algorithm for $I[x_1, \ldots, x_{r-1}]$, we can perform the required
coefficient arithmetic in $Q(x_1, \ldots, x_{r-1})$ and obtain a g.c.d. algorithm for $I[x_1, \ldots, x_r]$.
In the case r = 1 we have available the Euclidean algorithm for integers as a basis.

Let $\mathcal{R}$ be a g.c.d. domain, that is, an integral domain in which any two elements
have a g.c.d. If $a, b \in \mathcal{R}$, we say that a is an associate of b, and write $a \sim b$ in case
there is a unit u such that a = ub. "$\sim$" is an equivalence relation. $\mathcal{A}$ is called an
ample set for $\mathcal{R}$ in case $\mathcal{A}$ has exactly one element in each equivalence class of
associates. Relative to $\mathcal{A}$ we can write gcd(a, b) for the unique g.c.d. of a and b which
belongs to $\mathcal{A}$. If A(x) is any non-zero polynomial over $\mathcal{R}$, $A(x) = \sum_{i=0}^{n} a_i x^i$, the content of A,
written cont(A), is $\gcd(a_0, a_1, \ldots, a_n)$ and the primitive part of A, written pp(A),
is A/cont(A). Thus $A = \mathrm{cont}(A) \cdot \mathrm{pp}(A)$ and pp(A) is primitive, that is, its content
is a unit.

$\mathcal{R}[x]$ is also a g.c.d. domain, the units of $\mathcal{R}[x]$ are just the units of $\mathcal{R}$, and the
polynomials in $\mathcal{R}[x]$ with leading coefficients in $\mathcal{A}$ constitute an ample set for
$\mathcal{R}[x]$. We have, for $A(x), B(x) \in \mathcal{R}[x]$, $A \ne 0$ and $B \ne 0$,


(17) $\gcd(A, B) \sim \gcd(\mathrm{cont}(A), \mathrm{cont}(B)) \cdot \gcd(\mathrm{pp}(A), \mathrm{pp}(B))$.


In fact, if we choose an ample set which is multiplicative (closed under multiplication),
then we shall have equality in (17). Therefore, it suffices to obtain a g.c.d.
algorithm for primitive elements of $\mathcal{R}[x]$.

If A and B are non-zero elements of $\mathcal{R}[x]$, a remainder of A and B is a polynomial
R such that for some polynomial Q and some $c, d \in \mathcal{R}$, $c \ne 0$ and $d \ne 0$,


(18) cA = BQ + dR


and either R = 0 or deg(R) < deg(B). If ldcf(B) is a unit we always have a unique
remainder with c = d = 1, called the natural remainder and denoted by rem(A, B).
In general, if m = deg(A), and n = deg(B), we may assume $m \ge n$ (otherwise A
is a remainder with Q = 0 and c = d = 1) and then there is a unique remainder with
$c = b^{m-n+1}$ and d = 1, b = ldcf(B), called the pseudo-remainder and denoted
by prem(A, B).

A polynomial remainder sequence (p.r.s.) is a sequence of polynomials
$A_1, A_2, \ldots, A_l, A_{l+1} = 0$ in which $A_{i+2}$ is a remainder of $A_i$ and $A_{i+1}$ for $1 \le i < l$.
If A and B are nonzero polynomials, there always exists a p.r.s. with $A_1 = A$ and
$A_2 = B$. The nonzero polynomials A and B are similar, written $A \sim B$, in case
for some $c, d \in \mathcal{R}$, $c \ne 0$ and $d \ne 0$, cA = dB. If $A_1, A_2, \ldots, A_l, A_{l+1} = 0$ is a p.r.s.
for A and B, then $A_l \sim \gcd(A, B)$. Hence if A and B are primitive, then
$\gcd(A, B) \sim \mathrm{pp}(A_l)$. Since p.r.s.'s can be constructed in many different ways, this provides
us with many possible g.c.d. algorithms, some of which are much better than others.

If $\mathcal{R}$ is a field, we can use the natural Euclidean algorithm, in which
$A_{i+2} = \mathrm{rem}(A_i, A_{i+1})$. Even if $\mathcal{R}$ is not a field (as for example $\mathcal{R} = I[x_1, \ldots, x_{r-1}]$)
we can generate the natural Euclidean p.r.s. over the fraction field of $\mathcal{R}$, multiply
$A_l$ by the least common multiple of its denominators to obtain $\bar{A}_l \in \mathcal{R}[x]$, and compute
$\mathrm{pp}(\bar{A}_l)$. It turns out that the natural Euclidean algorithm is a very bad algorithm
because the numerators and denominators of the $A_i$ grow very rapidly as i increases,
and hence the computing time is large. The monic Euclidean algorithm, in which
$A_{i+2}$ is the monic polynomial which is similar to $\mathrm{rem}(A_i, A_{i+1})$, is much better.

Let us consider the special but illustrative case in which $\mathcal{R} = I$, deg(A) = n,
deg(B) = n - 1 and the p.r.s. $A_1, A_2, \ldots, A_l, A_{l+1} = 0$ is normal, that is,
$\deg(A_i) - \deg(A_{i+1}) = 1$ for $2 \le i < l$. Suppose also that the coefficients of A and B
are approximately d digits long. Then it can be shown that the numerator and
denominator of $A_i$ in the natural Euclidean p.r.s. have approximately $(i^2 - 3i + 3)d$
and $(i^2 - 3i + 2)d$ digits, respectively, for $3 \le i \le l$, and the computing time of the
algorithm is codominant with d*n°L(n), approximately. By contrast, in the monic 
Euclidean algorithm the numerator and denominator of A; have approximately 
2i—3 digits, and the computing time of the algorithm is codominant with d?n*L(n), 
approximately. This illustrates the dramatic impact which can be made by a seemingly 
insignificant change in the p.r.s. 




We can avoid an excursion into the fraction field by using instead the primitive
p.r.s. algorithm, in which $A_{i+2} = \mathrm{pp}(\mathrm{prem}(A_i, A_{i+1}))$. In this algorithm, typically
only about one fourth as many g.c.d.'s in $\mathcal{R}$ are required as in the monic Euclidean
algorithm. Since the g.c.d.'s in $\mathcal{R}$ account for most of the computing time of either
algorithm, we find that the primitive p.r.s. algorithm is several times faster, depending
on $\mathcal{R}$, but the computing times of the two algorithms are codominant if
$\mathcal{R} = I[x_1, \ldots, x_r]$, $r \ge 0$.

We can avoid all g.c.d.'s in $\mathcal{R}$, except at the end in computing $\mathrm{pp}(A_l)$, by setting
$A_{i+2} = \mathrm{prem}(A_i, A_{i+1})$, obtaining the pseudo-remainder p.r.s. algorithm. However,
we find that in this case the length of the coefficients of $A_i$ grows exponentially
as a function of i; as a result, the computing time of the algorithm is exponential 
in the degree n. 
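
The contrast between the primitive and the pseudo-remainder sequences is visible even on a small example. The sketch below works over the integers with coefficient lists (lowest degree first); the function names are chosen for this illustration, and the pair of test polynomials is a commonly used example that is not taken from this article.

```python
from functools import reduce
from math import gcd


def strip(P):
    """Drop zero leading coefficients."""
    P = list(P)
    while P and P[-1] == 0:
        P.pop()
    return P


def prem(A, B):
    """Pseudo-remainder R with b**(m-n+1) * A == B*Q + R, for m >= n.
    If m < n the loop does not run and A itself is returned."""
    R, B = strip(A), strip(B)
    n, b = len(B) - 1, B[-1]
    for _ in range(len(R) - len(B) + 1):    # exactly m - n + 1 multiplications by b
        if len(R) - 1 >= n:
            lead, shift = R[-1], len(R) - 1 - n
            R = [b * r for r in R]
            for j, coeff in enumerate(B):   # cancel the leading term of b*R
                R[shift + j] -= lead * coeff
            R = strip(R)
        else:
            R = [b * r for r in R]
    return R


def pp(P):
    """Primitive part: divide out the g.c.d. of the coefficients."""
    content = reduce(gcd, P, 0)
    return [c // content for c in P] if content else []


def prs(A, B, next_term):
    """Terms A_1, A_2, ..., A_l of the p.r.s. generated by `next_term`."""
    seq = [strip(A), strip(B)]
    while True:
        R = next_term(seq[-2], seq[-1])
        if not R:
            return seq
        seq.append(R)


A = [-5, 2, 8, -3, -3, 0, 1, 0, 1]          # x^8 + x^6 - 3x^4 - 3x^3 + 8x^2 + 2x - 5
B = [21, -9, -4, 0, 5, 0, 3]                # 3x^6 + 5x^4 - 4x^2 - 9x + 21
pseudo = prs(A, B, prem)
primitive = prs(A, B, lambda P, Q: pp(prem(P, Q)))
```

Both sequences have the same degrees, but the third term is $-15x^4 + 3x^2 - 9$ in the pseudo-remainder sequence and $-5x^4 + x^2 - 3$ in the primitive one, and by the last term the pseudo-remainder coefficients have grown to dozens of digits while the primitive sequence ends in a unit.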

In 1966, G. Collins discovered, [12] and [13], that every term of any p.r.s. for A
and B is similar to some subresultant of A and B, if A and B are polynomials of
positive degrees over any integral domain $\mathcal{R}$. Let m = deg(A), n = deg(B) and
$0 \le k < \min(m, n)$. Let M be the square matrix of order m + n whose successive
rows are the coefficients of $x^{n-1}A(x), \ldots, xA(x), A(x), x^{m-1}B(x), \ldots, xB(x), B(x)$.
M is the Sylvester matrix of A and B, and det(M), the determinant of M, is the
resultant of A and B. More generally, let $M_k$ be the matrix whose rows are the
coefficients of $x^{n-k-1}A(x), \ldots, A(x), x^{m-k-1}B(x), \ldots, B(x)$. $M_k$ has m + n - 2k rows
and m + n - k columns. For $0 \le j \le k$, let $M_{k,j}$ be the square matrix consisting
of the first m + n - 2k - 1 columns of $M_k$ followed by column m + n - k - j. Then
$S_k(x) = \sum_{j=0}^{k} \det(M_{k,j})\, x^j$ is the kth subresultant of A and B, a polynomial of
degree k or less.
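
The definition of the subresultants can be followed literally for small integer polynomials. In this sketch, which is meant only to make the construction concrete, the input polynomials are coefficient lists with the highest degree first (so that a list is already a row of coefficients), the determinant is computed exactly by elimination over the rationals, and all names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant of a square integer matrix by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in matrix]
    size, result = len(M), Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if M[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result
        result *= M[c][c]
        for r in range(c + 1, size):
            factor = M[r][c] / M[c][c]
            for j in range(c, size):
                M[r][j] -= factor * M[c][j]
    return int(result)


def subresultant(A, B, k):
    """Coefficients of S_k, lowest degree first, for 0 <= k < min(m, n)."""
    m, n = len(A) - 1, len(B) - 1
    width = m + n - k                         # number of columns of M_k

    def row(P, shift):
        # coefficients of x**shift * P, padded on the left to `width` entries
        return [0] * (width - len(P) - shift) + list(P) + [0] * shift

    M_k = [row(A, s) for s in range(n - k - 1, -1, -1)]
    M_k += [row(B, s) for s in range(m - k - 1, -1, -1)]
    lead = m + n - 2 * k - 1                  # leading columns kept in every M_{k,j}
    coeffs = []
    for j in range(k + 1):
        last = width - 1 - j                  # column m+n-k-j, counted from 1
        coeffs.append(det([r[:lead] + [r[last]] for r in M_k]))
    return coeffs
```

For $A = x^2 - 3x + 2$ and $B = x^2 - 1$, which share the factor $x - 1$, the resultant $S_0$ is 0 and $S_1 = 3x - 3$, which is similar to the g.c.d.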

By 1968 Collins, and independently W. S. Brown, had proved the fundamental
theorem of p.r.s.'s, which gives explicit formulas relating the terms of any p.r.s.
to their similar subresultants. Let $A_1, A_2, \ldots, A_l, A_{l+1}$ be a p.r.s. of A and B defined by


(19) $e_i A_i = A_{i+1} Q_i + f_i A_{i+2}$


for $1 \le i \le l - 1$ ($f_{l-1}$ is arbitrary). Let $c_i = \mathrm{ldcf}(A_i)$, $n_i = \deg(A_i)$ and $\delta_i = n_i - n_{i+1}$.
The fundamental theorem states that


k-1 
I] (- pers matarmdparaetnr 9 chet tA 
i=2 
k-1 
(20) = {1 ent | S, @Gsks), 
i=2 


k-1 

i -tann-1tly\(nj-anp-it dyeniarny- +1 di- +0; 
TI (— 1)" 1M -1 (nj-nK- 1 pnt ko c 1 | A, 
i=2 


k-1 
(21) _ TI pe destg 8 <k SI), 




(22) $S_j = 0$ for $n_k < j < n_{k-1} - 1$ ($3 \le k \le l$),
and
(23) $S_j = 0$ for $0 \le j < n_l$.


A proof of the fundamental theorem appears in [7]. 

The coefficients of the subresultants are determinants of known order, so the 
degrees and coefficient sizes of these coefficients can be estimated in terms of the 
degrees and coefficient sizes of the coefficients of A and B. Using the fundamental 
theorem, one can then estimate degrees and coefficient sizes for a specified p.r.s., 
and from this the computing time of a g.c.d. algorithm based on this p.r.s. All of 
the results stated or referred to above were obtained in this way and it is difficult 
to imagine that they could have been derived otherwise. 

Two new types of p.r.s.'s were introduced by Collins in [13] which avoid g.c.d.
calculations in $\mathcal{R}$ but which, unlike the pseudo-remainder p.r.s., control coefficient
growth in most cases. The reduced p.r.s. is defined by setting $e_i = c_{i+1}^{\delta_i + 1}$ for
$1 \le i < l$, $f_1 = 1$ and $f_i = e_{i-1}$ for $2 \le i \le l - 1$ in (19). The subresultant p.r.s.
is defined by $A_i = S_{n_{i-1} - 1}$, for $3 \le i \le l + 1$, but a process was obtained for
computing $A_{i+2}$ from $\mathrm{prem}(A_i, A_{i+1})$ and previous terms of the sequence.
In either p.r.s., $\mathrm{prem}(A_i, A_{i+1})$ is divided by certain powers of the $c_j$ for $j \le i$ to
obtain $A_{i+2}$, but the coefficients of the $A_i$ remain in $\mathcal{R}$ for any integral domain $\mathcal{R}$.
This is true by definition for the subresultant p.r.s. and by the fundamental theorem
we have

k-2 


(24) A, = [TI] (-ererncrt perenne nego] pyiermtsy. 
i=2 
B<k<)b, 


where $A_1, A_2, \ldots, A_l, A_{l+1} = 0$ is the reduced p.r.s. and $c_i = \mathrm{ldcf}(A_i)$. Each exponent
$\delta_{i-1}(\delta_i - 1)$ of $c_i$ is non-negative and indeed, each exponent is zero just in case the
p.r.s. is normal, in which case the two p.r.s.'s differ only in signs. For randomly
chosen polynomials A and B the p.r.s. is almost always normal as observed by 
Knuth ([40], Sec. 4.6.1). In such normal cases coefficient growth is controlled by 
the reduced p.r.s. algorithm; its computing time is codominant with that of the 
primitive p.r.s. algorithm, but typically several times faster. However, in severely 
non-normal cases coefficient growth and computing time for the reduced p.r.s.
algorithm can be exponential in the degree n. The subresultant p.r.s. algorithm 
controls coefficient growth in these non-normal cases, but it performs so many 
multiplications and divisions in doing so that its computing time is also exponential. 

The foregoing developments relating to the theory of polynomial remainder 
sequences and subresultants did much for polynomial g.c.d. calculation, but the 
problem remained in a somewhat unsatisfactory state prior to the introduction 
of modular algorithms in 1968. 




In order to focus on the essential ideas, let us consider first just the case $\mathcal{R} = I$,
and later generalize to $\mathcal{R} = I[x_1, \ldots, x_r]$. There is a unique homomorphism, $\phi_p$,
of I onto the finite field $GF(p) = \{0, 1, \ldots, p - 1\}$ of the integers modulo p such that
$\phi_p(1) = 1$, for any prime p; $\phi_p$ is called a modular homomorphism. It is easy to
see that if $\phi$ is any homomorphism from any g.c.d. domain, $\mathcal{R}$, to another, $\mathcal{S}$, then
$\phi(\gcd(A, B)) \mid \gcd(\phi(A), \phi(B))$ for any polynomials A and B over $\mathcal{R}$. Hence
if $\phi(A) \ne 0$ or $\phi(B) \ne 0$, then $\deg(\gcd(\phi(A), \phi(B))) \ge \deg(\phi(\gcd(A, B)))$. If
$\deg(\gcd(\phi(A), \phi(B))) = \deg(\phi(\gcd(A, B)))$, then $\gcd(\phi(A), \phi(B)) \sim \phi(\gcd(A, B))$. Let
C = gcd(A, B) and k = deg(C). Assume $\phi(\mathrm{ldcf}(A)) \ne 0$ and $\phi(\mathrm{ldcf}(B)) \ne 0$. Then
$\phi(S_k(A, B)) = S_k(\phi(A), \phi(B))$, where $S_k(A, B)$ denotes the kth subresultant of A and
B. By the fundamental theorem, then, $\deg(\gcd(\phi(A), \phi(B))) = k$ if and only if
$\deg(\phi(S_k(A, B))) = k$, and this is equivalent to the condition that $\phi(\det(M_{k,k})) \ne 0$,
since $\det(M_{k,k})$ is the leading coefficient of $S_k(A, B)$.

Thus in the case $\mathcal{I} = I$, $\gcd(\phi_p(A),\phi_p(B)) \sim \phi_p(\gcd(A,B))$ unless $p$ divides
$\mathrm{ldcf}(A)$, $\mathrm{ldcf}(B)$, or $\det(M_{k,k})$. There are clearly only a finite number of such unlucky
primes. We also have, for any $\mathcal{I}$, $\mathrm{ldcf}(\gcd(A,B)) \mid \gcd(\mathrm{ldcf}(A),\mathrm{ldcf}(B))$. Thus there
is a polynomial $\bar{C}$ similar to $C$ whose leading coefficient is $\bar{c} = \gcd(a,b)$ where
$a = \mathrm{ldcf}(A)$ and $b = \mathrm{ldcf}(B)$. For the field $GF(p)$ we can use $\{0,1\}$ as ample set,
so that $C_p^* = \gcd(\phi_p(A),\phi_p(B))$ is monic. Multiplying $C_p^*$ by $\bar{c}_p^* = \phi_p(\bar{c})$, we obtain
$\bar{C}_p^*$, and $\bar{C}_p^* = \phi_p(\bar{C})$ provided $p$ is lucky. From a sufficient number of such
$\bar{C}_p^*$ we can compute $\bar{C}$ using an algorithm for the Chinese remainder theorem, and
then $C = \mathrm{pp}(\bar{C})$, assuming as before that $A$ and $B$ are primitive.


In following this plan, one uses a precomputed list of distinct primes, each nearly
as large as can be stored in one computer memory location. An a priori bound $U$
is computed for $\det(M_{k,k})$ and the coefficients of $\bar{C}$. Primes which divide $a$ or $b$,
or which produce non-minimal g.c.d. degrees relative to other primes, are rejected.
When primes $p_1,\ldots,p_t$ with $\prod_{i=1}^{t} p_i > 2U$ have been used and retained, the Chinese
remainder theorem is applied. This algorithm has a computing time dominated
by $n^3L(d)^2$, where the inputs have degrees of $n$ or less and sum norms of at most $d$.


The algorithm just described ordinarily uses far more primes than actually 
necessary. In place of this a priori bound method one may use a trial division 
method, together with an iterative algorithm for the Chinese remainder theorem. 
After processing each prime, the Chinese remainder theorem is applied to incorporate
the $\bar{C}_p^*$ just computed. If the result $\tilde{C}$ so obtained is unchanged from the previous
application, trial divisions of $\bar{c}A$ and $\bar{c}B$ by $\tilde{C}$ are performed, using a modular
algorithm. If $\tilde{C}$ is a common divisor of $\bar{c}A$ and $\bar{c}B$, then $\tilde{C} = \bar{C}$ and $C = \mathrm{pp}(\bar{C})$;
otherwise another prime is processed. If $\deg(\bar{C}_p^*) < \deg(\tilde{C})$ then $\tilde{C}$ is discarded and
a new $\tilde{C}$ is formed from $\bar{C}_p^*$. The trial divisions are themselves performed by a modular
algorithm. Provided that only a negligible number of unlucky primes are processed,
and provided that $C$, $A/C$ and $B/C$ have sum norms of $d$ or less, the computing time
of this g.c.d. algorithm is dominated by $n^2L(d) + nL(d)^2$. One may argue intuitively


740 G. E. COLLINS [September 


that this assumption will almost always be satisfied and that accordingly the average
computing time of the algorithm is codominant with $n^2L(d) + nL(d)^2$. But the
maximum computing time of the algorithm may be as large as $n^3L(d)^2$. The algorithm
is further enhanced by designing it to terminate with $C = 1$ whenever
a prime $p$ is found for which $C_p^* = 1$. If $C = 1$ then $C_p^* = 1$ for every lucky prime $p$,
so one may likewise argue intuitively that the average computing time of the algorithm
is codominant with $n^2 + nL(d)$ for relatively prime inputs.
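
The trial division variant just described can be sketched for univariate integral polynomials as follows. This is only an illustrative sketch under stated assumptions, not the SAC-1 implementation: polynomials are Python lists of coefficients with the lowest degree first, the helper names (`trim`, `gcd_mod_p`, `divides`, `modular_gcd`) are hypothetical, the trial divisions are done classically rather than by a modular algorithm, and the primes in the usage note are tiny so that an unlucky prime actually occurs.

```python
from functools import reduce
from math import gcd


def trim(a):
    """Drop zero leading coefficients (lists are lowest degree first)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def gcd_mod_p(a, b, p):
    """Monic g.c.d. in GF(p)[x] by the Euclidean algorithm."""
    a, b = a[:], b[:]
    while b:
        inv = pow(b[-1], -1, p)
        while len(a) >= len(b):
            q, s = a[-1] * inv % p, len(a) - len(b)
            for i, c in enumerate(b):
                a[i + s] = (a[i + s] - q * c) % p
            trim(a)
        a, b = b, a
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]


def divides(c, a):
    """Classical trial division over the integers: True if c divides a."""
    a = a[:]
    while len(a) >= len(c):
        if a[-1] % c[-1]:
            return False
        q, s = a[-1] // c[-1], len(a) - len(c)
        for i, ci in enumerate(c):
            a[i + s] -= q * ci
        trim(a)
    return not a


def modular_gcd(A, B, primes):
    """g.c.d. of primitive integral polynomials A and B."""
    cbar = gcd(A[-1], B[-1])
    M, C = 1, None                      # modulus so far, current CRT image
    for p in primes:
        if A[-1] % p == 0 or B[-1] % p == 0:
            continue                    # p divides a leading coefficient
        Cp = gcd_mod_p(trim([c % p for c in A]), trim([c % p for c in B]), p)
        if len(Cp) == 1:
            return [1]                  # early exit: relatively prime inputs
        Cp = [cbar * c % p for c in Cp]
        if C is not None and len(Cp) > len(C):
            continue                    # degree too high: p is unlucky
        if C is None or len(Cp) < len(C):
            M, new = p, Cp              # earlier primes were unlucky; restart
        else:
            inv = pow(M, -1, p)         # one step of the iterative CRT
            new = [c + M * ((cp - c) * inv % p) for c, cp in zip(C, Cp)]
            M *= p
        new = [c % M - M if c % M > M // 2 else c % M for c in new]
        if new == C and divides(C, [cbar * c for c in A]) \
                and divides(C, [cbar * c for c in B]):
            g = reduce(gcd, C) * (1 if C[-1] > 0 else -1)
            return [c // g for c in C]  # primitive part of the image
        C = new
    raise ValueError("ran out of primes")
```

For example, with A = (2x+1)(x-3)(x+5) and B = (2x+1)(x-3)(x-7), the call `modular_gcd([-15, -28, 5, 2], [21, 32, -19, 2], [3, 5, 7, 11, 13])` returns `[-3, -5, 2]`, that is, 2x^2 - 5x - 3. The prime 3 is unlucky here (it yields a g.c.d. of degree 3) and its image is discarded as soon as the prime 5 yields degree 2.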

The method just described generalizes readily to produce a g.c.d. algorithm for
$I[x_1,\ldots,x_r]$, assuming the availability of a g.c.d. algorithm for $GF(p)[x_1,\ldots,x_r]$.
The obvious method is to regard $A(x_1,\ldots,x_r) \in I[x_1,\ldots,x_r]$ as a polynomial in $x_r$
with coefficients in $I[x_1,\ldots,x_{r-1}]$. One must then begin by computing the primitive
parts of $A$ and $B$, so a g.c.d. algorithm is also required for $I[x_1,\ldots,x_{r-1}]$. Brown
observed [5], however, that a more efficient algorithm is obtained by treating the
variables symmetrically. Define the integer content of $A$, $\mathrm{cont}_I(A)$, as the g.c.d.
of the integer coefficients of $A$ and define the integer primitive part of $A$ by
$\mathrm{pp}_I(A) = A/\mathrm{cont}_I(A)$. Then the role of (17) is played by the formula


(25) $\gcd(A,B) \sim \gcd(\mathrm{cont}_I(A), \mathrm{cont}_I(B)) \cdot \gcd(\mathrm{pp}_I(A), \mathrm{pp}_I(B))$.


And in place of rejecting primes which produce g.c.d.'s of non-minimal degree
in $x_r$, we reject all those which produce g.c.d.'s with non-minimal degree vectors,
relative to some lexicographical ordering. Furthermore, we replace the leading
coefficient of $A$ by the integer leading coefficient of $A$, $\mathrm{ldcf}_I(A)$, and set
$\bar{c} = \gcd(\mathrm{ldcf}_I(A), \mathrm{ldcf}_I(B))$. Using these replacements, Brown's method avoids some
g.c.d. calculations in $I[x_1,\ldots,x_{r-1}]$ at a cost of numerous g.c.d. calculations in $I$.
The codominance class of the computing time is not affected, but experiments conducted
by the author showed a substantial speed advantage for Brown's method.
A completely analogous method may be used to obtain a g.c.d. algorithm for
$GF(p)[x_1,\ldots,x_r]$, given one for $GF(p)[x_1,\ldots,x_{r-1}]$ whenever $r \geq 2$, with evaluation
homomorphisms replacing modular homomorphisms. The evaluation homomorphism
$\psi_a$, for $a \in GF(p)$, is defined by $\psi_a(A(x_r)) = A(a)$. For evaluation homomorphisms,
the analogue of the Chinese remainder theorem is interpolation, and degree
bounds replace integer coefficient bounds. Actually, an evaluation homomorphism
is a special type of modular homomorphism, in which the modulus is the irreducible
polynomial $x_r - a$, and interpolation is a special form of the Chinese remainder
theorem; see [44] for an excellent comprehensive treatment of the Chinese remainder
theorem and interpolation. Brown's symmetric treatment is again available, for
the variables $x_1,\ldots,x_{r-1}$, with $GF(p)[x_r]$ taking the place of $I$, and the trial divisions
are themselves performed using evaluation homomorphisms and interpolation.
Brown has further improved the algorithm just described by eliminating the trial
divisions. Let $A_1 = A/C$ and $B_1 = B/C$; $A_1$ and $B_1$ are called the cofactor polynomials
of $A$ and $B$. $A_1$ and $B_1$ are similar to polynomials $\bar{A}_1$ and $\bar{B}_1$ with $\mathrm{ldcf}_I(\bar{A}_1) = a$
and $\mathrm{ldcf}_I(\bar{B}_1) = b$, and we have $\bar{A}_1\bar{C} = \bar{c}A$ and $\bar{B}_1\bar{C} = \bar{c}B$. Hence whenever $p$ is lucky




we have $\phi_p(\bar{A}_1) = \phi_p(\bar{c})\phi_p(A)/\bar{C}_p^* = \bar{A}_p^*$ and $\phi_p(\bar{B}_1) = \phi_p(\bar{c})\phi_p(B)/\bar{C}_p^* = \bar{B}_p^*$. By applying
the Chinese remainder theorem to the $\bar{A}_p^*$ and $\bar{B}_p^*$, polynomials $\tilde{A}_1$ and $\tilde{B}_1$ can be computed
for which $\tilde{A}_1\tilde{C} \equiv \bar{c}A$ (modulo $M$) and $\tilde{B}_1\tilde{C} \equiv \bar{c}B$ (modulo $M$), where $M$ is the
product of all primes used to obtain $\tilde{C}$ via the Chinese remainder theorem. Hence if
$M > 2\max\{|\bar{c}A|_1, |\tilde{A}_1|_1|\tilde{C}|_1\}$ then $\tilde{A}_1\tilde{C} = \bar{c}A$, and a test of this inequality replaces
trial division of $\bar{c}A$ by $\tilde{C}$. The cofactor polynomials are frequently a useful byproduct
of the g.c.d. calculation; for example, if $A$ and $B$ are the numerator and denominator
of a rational function, then $A_1$ and $B_1$ are the numerator and denominator after
reduction to lowest terms.

If the degrees of $A$ and $B$ are at most $n$ in each of the $r$ variables, and if the sum
norms of $A$, $B$, $C$, $A_1$ and $B_1$ are all at most $d$, and if a negligible number of unlucky
homomorphisms are encountered, then, as follows from the discussion in [5], the
computing time of this modular g.c.d. algorithm is dominated by $n^{r+1}L(d) + n^rL(d)^2$.
Hence one may conjecture that the average computing time of the algorithm is
codominant with $n^{r+1}L(d) + n^rL(d)^2$. This is the same as the time required to multiply
$A$ by $B$, using a modular algorithm, and much less than the time to multiply $A$ by $B$
using the classical algorithm.

The SAC-1 subsystem [20] provides modular algorithms for g.c.d. calculation,
resultant calculation, multiplication, division and trial division in $I[x_1,\ldots,x_r]$.
As an example of actual computing times, let A, B and C be pairwise relatively prime 
univariate integral polynomials of degree 40 with coefficients which are eight decimal 
digits long. Table 2 gives computing times in seconds for the SAC-1 system on a 
UNIVAC 1108 computer for various operations and algorithms. More extensive 
tables are given in [20]. 


TABLE 2. Polynomial Algebra Computing Times


gcd(A, B)
gcd(AC, BC)
A · C (modular)
A · C (classical)
AC/C (modular)
AC/C (classical)


5. Polynomial factorization. The spectacular progress of recent years in obtain- 
ing efficient algorithms for polynomial g.c.d. calculation has been matched by equally 
significant progress in obtaining efficient polynomial factorization algorithms. As 
one might easily guess, the factorization problem is much more difficult than the 
g.c.d. problem, and so the current state of development of factorization algorithms 
is less advanced, but the achievements of the last five years have been remarkable! 

If $\mathfrak{A}$ is any unique factorization domain (u.f.d.) and $a$ is any non-zero element
of $\mathfrak{A}$, $a$ can be expressed in the form $a = u\prod_{i=1}^{r} a_i^{e_i}$ where $u$ is a unit, the $a_i$ are
irreducible, and the $e_i$ are positive integers. Such an expression is unique to within




associates and the order of the irreducible factors. We will assume that we are given
a multiplicative ample set $\mathcal{A}$ for $\mathfrak{A}$. If we then require that the $a_i \in \mathcal{A}$ and assume
also that $a \in \mathcal{A}$ then we must have $u = 1$ and the set of pairs $(a_i, e_i)$ is unique. We
may then refer to $\{(a_1,e_1),\ldots,(a_r,e_r)\}$ as the complete factorization of $a$. Also, we
are interested in the case that $\mathfrak{A}$ is a polynomial domain $\mathfrak{A}_0[x]$ and $\mathfrak{A}_0$ has a multiplicative
ample set $\mathcal{A}_0$. We will then always choose as ample set for $\mathfrak{A}$ the set $\mathcal{A}$
consisting of all polynomials over $\mathfrak{A}_0$ with leading coefficients in $\mathcal{A}_0$. Also, we shall
be concerned primarily with the cases $\mathfrak{A} = GF(q)[x_1,\ldots,x_s]$ and $\mathfrak{A} = I[x_1,\ldots,x_s]$
although several of the algorithms to be discussed are of more general applicability.


The ‘‘classical’’ algorithm for polynomial factorization is Kronecker’s method, 
which goes back at least to 1882 [42]. Kronecker's method is applicable whenever
the coefficient domain $\mathfrak{A}_0$ is infinite and has only a finite number of units, and provides
us with a complete factorization algorithm for $\mathfrak{A}_0[x]$ whenever we already
have one for $\mathfrak{A}_0$. In particular, it provides a complete factorization algorithm for
$I[x_1,\ldots,x_s]$ by induction on $s$.

For every positive integer k, Kronecker’s algorithm enables us to find all factors 
of degree k of a given primitive polynomial A. From any such algorithm we can 
obtain an algorithm for the complete factorization of any polynomial A, as follows. 

1. Set B = pp(A) and set k = 1. 

2. Find all factors of B of degree k. 

3. Divide B as often as possible by the factors found in step 2, and add 1 to k. 

4. If $k \leq \deg(B)/2$, go back to step 2.


Whenever step 2 is performed, the polynomial B has no factors of degree less than k, 
and so the irreducible factors of pp(A) are all the factors found in step 2 together 
with the polynomial B (unless B is a unit) after step 4 is last performed. The multiplicity
of each factor is discovered in step 3, and the content of A is factored separately
in $\mathfrak{A}_0$.

If $A$ and $B$ are polynomials over $\mathfrak{A}_0$ and $B \mid A$, then $B(a) \mid A(a)$ for every $a \in \mathfrak{A}_0$.
If $A(a) \neq 0$, then $A(a)$ has only a finite number of divisors since $\mathfrak{A}_0$ has only a finite
number of units. Since $\mathfrak{A}_0$ is infinite we can find $k+1$ distinct elements of $\mathfrak{A}_0$,
say $a_0, a_1, \ldots, a_k$, such that $A(a_i) \neq 0$ for all $i$. If $F_i$ is the finite set of factors of
$A(a_i)$ and $\mathfrak{F}_0$ is the fraction field of $\mathfrak{A}_0$, we can compute by interpolation the set
of all polynomials $B$ over $\mathfrak{F}_0$ of degree $k$ or less such that $B(a_i) \in F_i$ for $0 \leq i \leq k$.
After discarding those polynomials $B$ of degree less than $k$ and those with coefficients
not in $\mathfrak{A}_0$, Kronecker's algorithm continues by performing a trial division of $A$
by each remaining polynomial.
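
The search for factors of a given degree k can be sketched for the case of integer coefficients as follows. This is a minimal illustration, assuming polynomials stored as lists of integer coefficients with the lowest degree first; the function names are hypothetical, the sample points are simply the integers of smallest absolute value at which A does not vanish, and only candidates with positive leading coefficient are kept, which removes associates.

```python
from fractions import Fraction
from itertools import product


def evaluate(a, x):
    """Value at x of the polynomial with coefficient list a."""
    return sum(c * x**i for i, c in enumerate(a))


def divisors(n):
    """All positive and negative divisors of a non-zero integer n."""
    pos = [d for d in range(1, abs(n) + 1) if n % d == 0]
    return pos + [-d for d in pos]


def interpolate(xs, ys):
    """Lagrange interpolation over Q; returns len(xs) coefficients."""
    coeffs = [Fraction(0)] * len(xs)
    for i, xi in enumerate(xs):
        basis, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                shifted = [Fraction(0)] + basis             # x * basis
                scaled = [xj * c for c in basis] + [Fraction(0)]
                basis = [u - v for u, v in zip(shifted, scaled)]
                denom *= xi - xj
        for k, c in enumerate(basis):
            coeffs[k] += ys[i] * c / denom
    return coeffs


def divides(c, a):
    """Trial division over the integers: True if c divides a exactly."""
    a = list(a)
    while len(a) >= len(c):
        if a[-1] % c[-1]:
            return False
        q, s = a[-1] // c[-1], len(a) - len(c)
        for i, ci in enumerate(c):
            a[i + s] -= q * ci
        while a and a[-1] == 0:
            a.pop()
    return not a


def kronecker_factors(A, k):
    """All degree-k divisors of A in Z[x] with positive leading coefficient."""
    n = len(A) + k
    # k + 1 sample points a_0, ..., a_k at which A does not vanish
    xs = [x for x in sorted(range(-n, n + 1), key=abs)
          if evaluate(A, x) != 0][:k + 1]
    found = set()
    for ys in product(*(divisors(evaluate(A, x)) for x in xs)):
        B = interpolate(xs, ys)
        if B[k] <= 0 or any(c.denominator != 1 for c in B):
            continue        # degree below k, an associate, or not integral
        B = [int(c) for c in B]
        if divides(B, A):
            found.add(tuple(B))
    return sorted(found)
```

For A = x^4 + 4 the call `kronecker_factors([4, 0, 0, 0, 1], 2)` examines 6 · 4 · 4 = 96 interpolated candidates and keeps x^2 - 2x + 2 and x^2 + 2x + 2; the number of candidates grows exponentially with k, as the following paragraph observes.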

Unless $\mathfrak{A}_0$ has characteristic two it has at least two units, each $F_i$ has at least
two members, and the number of interpolations performed, for given $k$, is at least
$2^{k+1}$. If $A$ is irreducible and of degree $n$, then $k$ will assume all values up to $\lfloor n/2 \rfloor$,
so the maximum computing time of Kronecker's algorithm is an exponential function
of $n$. In case $\mathfrak{A}_0 = I$, the maximum computing time of the algorithm for polynomials




with norms less than $d$ will also be an exponential function of $L(d)$, for there is no
known algorithm for complete factorization in $I$ whose computing time is dominated
by a polynomial function of the length of the integer to be factored. Numerous
devices have been proposed for making Kronecker's algorithm more efficient (see,
e.g., [38]) in the case $\mathfrak{A}_0 = I$, but these devices do not eliminate the exponential
character of the algorithm, nor do they succeed in making the algorithm practically
useful.

Kronecker's algorithm is also applicable when $\mathfrak{A}_0 = GF(q)[x_1,\ldots,x_s]$, but
again not very practical. It is not applicable when $\mathfrak{A}_0 = GF(q)$ and this is the case,
as one might guess by analogy with the g.c.d. problem, which is crucial for the 
other cases. No general and reasonably efficient algorithm was known for the 
complete factorization of univariate polynomials over a finite field until 1967, when 
such an algorithm was discovered by Elwyn R. Berlekamp. Since Berlekamp’s 
algorithm ([2]; see also [3], Section 6.1, and [40], Section 4.6.2) is applicable only 
to squarefree polynomials, we shall first consider how the reduction of the problem 
to this case can be achieved. 

If $\prod_{i=1}^{r} A_i^{e_i}$ is the complete factorization of $A$ in any u.f.d. $\mathfrak{A}$, then $A$ is squarefree
in case each $e_i = 1$. The product $\bar{A} = \prod_{i=1}^{r} A_i$ is the greatest squarefree
divisor of $A$, $\bar{A} = \mathrm{gsfd}(A)$. Let $k = \max_{1 \leq i \leq r} e_i$ and define $B_j = \prod_{e_i = j} A_i$ for
$1 \leq j \leq k$. Then


(26) $A = \prod_{j=1}^{k} B_j^{\,j}$


is the squarefree factorization of $A$. The $B_j$ are squarefree and pairwise relatively
prime, and $B_k \neq 1$. Conversely, these conditions completely characterize the $B_j$.

Now assume that $\mathfrak{A}$ is a polynomial domain, and denote by $A'$ the derivative
of $A$. If $\mathfrak{A}$ has characteristic zero then $\gcd(A, A') = \prod_{i=1}^{r} A_i^{e_i - 1} = \prod_{j=2}^{k} B_j^{\,j-1}$, and
so


(27) $\mathrm{gsfd}(A) = A/\gcd(A, A')$.


From the complete factorization of $\bar{A} = \mathrm{gsfd}(A)$, we can obtain the complete
factorization of $A$ by repeated trial division since $A$ and $\bar{A}$ have the same irreducible
divisors. However, it will generally be advantageous to first compute the squarefree
factorization and then compute the complete factorization of each $B_j$ which is not
unity.

For a domain of arbitrary characteristic, define


(28) $\delta_i = 0$ if $(A_i^{e_i})' = 0$, and $\delta_i = 1$ otherwise,

(29) $B_j = \prod_{e_i = j,\ \delta_i = 1} A_i$,

and

(30) $S = \prod_{\delta_i = 0} A_i^{e_i}$.

Then

(31) $A = \Bigl\{\prod_{j=1}^{k} B_j^{\,j}\Bigr\} S$,


and Musser [50], has given an algorithm which computes the $B_j$ and $S$. If $S = 1$,
in particular whenever the characteristic is zero, we obtain the squarefree factorization.
Let $C_i = \bigl\{\prod_{j=i+1}^{k} B_j^{\,j-i}\bigr\} S$ and $D_i = \prod_{j=i}^{k} B_j$ for $1 \leq i \leq k+1$. Musser's
algorithm computes $C_1 = \gcd(A, A')$, $D_1 = A/C_1$, $D_{i+1} = \gcd(C_i, D_i)$, $C_{i+1} = C_i/D_{i+1}$,
$B_i = D_i/D_{i+1}$, and terminates when $D_i = 1$ with $i = k+1$ and $C_i = S$.
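
The recurrences just given can be traced in a short sketch for the characteristic zero case, where S = 1. The code below is an illustration under stated assumptions and not Musser's implementation: polynomials are coefficient lists over the rationals with the lowest degree first, the helper names are hypothetical, and the g.c.d. is a plain Euclidean algorithm with exact rational arithmetic.

```python
from fractions import Fraction


def trim(a):
    """Drop zero leading coefficients (lists are lowest degree first)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def deriv(a):
    """Formal derivative of a."""
    return trim([i * c for i, c in enumerate(a)][1:])


def divmod_poly(a, b):
    """Quotient and remainder of a by b over the rationals."""
    a = a[:]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        s, t = len(a) - len(b), a[-1] / b[-1]
        q[s] = t
        for i, c in enumerate(b):
            a[i + s] -= t * c
        trim(a)
    return q, a


def gcd_poly(a, b):
    """Monic g.c.d. by the Euclidean algorithm."""
    while b:
        a, b = b, divmod_poly(a, b)[1]
    return [c / a[-1] for c in a]


def squarefree_factorization(A):
    """Return [B_1, ..., B_k] with A = B_1 * B_2**2 * ... * B_k**k."""
    A = [Fraction(c) for c in A]
    C = gcd_poly(A, deriv(A))            # C_1 = gcd(A, A')
    D = divmod_poly(A, C)[0]             # D_1 = A / C_1
    B = []
    while len(D) > 1:                    # stop when D_i = 1
        D_next = gcd_poly(C, D)          # D_{i+1} = gcd(C_i, D_i)
        C = divmod_poly(C, D_next)[0]    # C_{i+1} = C_i / D_{i+1}
        B.append(divmod_poly(D, D_next)[0])   # B_i = D_i / D_{i+1}
        D = D_next
    return B
```

For A = (x + 1)(x - 1)^2 = x^3 - x^2 - x + 1, `squarefree_factorization([1, -1, -1, 1])` yields B_1 = x + 1 and B_2 = x - 1. Any non-unit leading coefficient of A ends up in B_1, since only the g.c.d.'s are normalized to be monic.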

If $S \neq 1$ and $\mathfrak{A}$ is a domain of multivariate polynomials, say $\mathfrak{A}_0[x_1,\ldots,x_s]$,
Musser's algorithm can be applied to $S$ with some variable $x_i$ in place of $x$ such
that $\partial S/\partial x_i \neq 0$ and a further factorization of $S$ will result. If finally some $S \neq 1$
is obtained with $\partial S/\partial x_i = 0$ for all $i$, and if $\mathfrak{A}_0$ is a finite field, $GF(p^h)$, then Musser
observes that $S$ is the $p$th power of some polynomial $T$. We shall have
$S(x_1,\ldots,x_s) = \sum_i a_i x_1^{pe_{i1}} \cdots x_s^{pe_{is}}$ and then $T(x_1,\ldots,x_s) = \sum_i a_i^{1/p} x_1^{e_{i1}} \cdots x_s^{e_{is}}$ where
$a_i^{1/p} = a_i^{p^{h-1}}$.

Additions and subtractions can be performed in $GF(q)$ in time dominated by
$L(q)$, multiplications and divisions in $L(q)^2$. If $a \in GF(q)$, $a^k$ can be computed
using at most $2\log_2 k$ multiplications (see [40], Section 4.6.3), so the time to compute
$a^{1/p}$ in $GF(q)$ is dominated by $L(q)^3$.
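
The bound of $2\log_2 k$ multiplications is attained by the binary method of exponentiation, which can be sketched as follows. The sketch is restricted, as an assumption made for brevity, to a prime field GF(p) represented by integers modulo p; the function name and the returned multiplication count are illustrative only.

```python
def power_mod(a, k, p):
    """Return (a**k mod p, number of multiplications used), for k >= 1."""
    result, base, mults = None, a % p, 0
    while k > 0:
        if k & 1:
            if result is None:
                result = base            # first factor: no multiplication
            else:
                result = result * base % p
                mults += 1
        k >>= 1
        if k:
            base = base * base % p       # one squaring per remaining bit
            mults += 1
    return result, mults
```

For instance `power_mod(3, 13, 17)` returns `(12, 5)`: the exponent 13 has four binary digits, three of them ones, so three squarings and two further multiplications suffice, within the bound $2\log_2 13 \approx 7.4$. In $GF(p^h)$ the same method applied with $k = p^{h-1}$ gives the $p$th root used above.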

The input to Berlekamp's 1967 algorithm is a monic squarefree univariate polynomial
$A$ over $GF(q)$. The algorithm has two phases. The first phase determines
the number, $r$, of irreducible monic factors of $A$, and has a computing time dominated
by $n^3L(q)^2 + n^2L(q)^3$. If $r = 1$, then $A$ is irreducible and the algorithm
terminates. Otherwise, the second phase must be performed, for which the computing
time is dominated by $n^2rqL(q)^2$.

Berlekamp's algorithm therefore provides an efficient irreducibility test for univariate
polynomials over $GF(q)$. If $A$ is a squarefree univariate integral polynomial
then $\mathrm{discr}(A)$, the discriminant of $A$, is a non-zero integer. If $p$ is a prime which is
not a divisor of $\mathrm{ldcf}(A)$, then $\mathrm{discr}(\phi_p(A)) = \phi_p(\mathrm{discr}(A))$, so if $p$ does not divide
$\mathrm{discr}(A)$ then $\mathrm{discr}(\phi_p(A)) \neq 0$ and $\phi_p(A)$ is squarefree. Hence $\phi_p(A)$ is squarefree
for all but a finite number of primes, and we can readily find a prime for which
$\phi_p(A)$ is squarefree; in fact, for given $p$, $\phi_p(A)$ is squarefree with probability $1 - 1/p$
([40], Section 4.6.2). If $\phi_p(A)$ is irreducible then so is $A$, and we may seek to prove
the irreducibility of integral polynomials in this manner. However, if $\deg(A) = n$
the probability that $\phi_p(A)$ is irreducible is only about $1/n$ ([40], Section 4.6.2), so
the average number of trials required will be about $n$. Moreover, one cannot decide
irreducibility in this manner for there are irreducible polynomials $A$ such that




$\phi_p(A)$ is reducible for every prime $p$. A simple example is the polynomial $A(x) = x^4 + 1$
([40], Section 4.6.2).
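
The behaviour of $x^4 + 1$ can be checked experimentally for small primes. The brute force sketch below, whose function name is hypothetical, looks for a monic quadratic divisor modulo p by dividing $x^4 + 1$ by every $x^2 + bx + c$; it is meant only to illustrate the claim, not as a factorization method.

```python
def quadratic_factor_mod_p(p):
    """Return (b, c) with x^2 + b*x + c dividing x^4 + 1 modulo p, or None."""
    for b in range(p):
        for c in range(p):
            rem = [1, 0, 0, 0, 1]            # x^4 + 1, lowest degree first
            for s in (2, 1, 0):              # long division by a monic quadratic
                q = rem[s + 2]
                rem[s + 2] = 0
                rem[s + 1] = (rem[s + 1] - q * b) % p
                rem[s] = (rem[s] - q * c) % p
            if rem[0] == 0 and rem[1] == 0:
                return b, c
    return None
```

A quadratic divisor is found for every prime tried; for p = 2 the search returns `(0, 1)`, reflecting $x^4 + 1 \equiv (x^2 + 1)^2$, and for p = 3 one has $x^4 + 1 \equiv (x^2 + x + 2)(x^2 + 2x + 2)$.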

One might consider a complete factorization algorithm for primitive squarefree
univariate integral polynomials based on the Chinese remainder theorem. After
computing a bound $B$ on the coefficients of the divisors of $A$, one would factor
$\phi_{p_i}(A)$ for primes $p_1,\ldots,p_t$ such that $\prod_{i=1}^{t} p_i > 2B$. However, the time to apply
the second phase of Berlekamp's algorithm to all of the $\phi_{p_i}(A)$ may be proportional
to $B$, and $B$ itself may be an exponential function of $n$ as observed in the previous
section. Moreover, if $A$ has two irreducible factors, $A_1$ and $A_2$, of the same degree,
there is no apparent way to distinguish $\phi_p(A_1)$ from $\phi_p(A_2)$ in applying the Chinese
remainder theorem, so it will have to be applied in all of the $2^{t-1}$ possible ways.

A superior alternative to the Chinese remainder theorem is provided by Hensel's
p-adic lemma, [35], as proposed by Hans Zassenhaus in 1968, [57]. If $\phi_p(A)$ is
squarefree, from the complete factorization of $A$ modulo $p^j$ we obtain by Hensel's
algorithm the complete factorization of $A$ modulo $p^{j+1}$. Starting with $j = 1$, we
eventually obtain the complete factorization of $A$ modulo $m = p^k$ with $p^k > 2B$. If $A$
has $r$ irreducible monic factors modulo $p$, then $A$ also has exactly $r$ irreducible monic
factors modulo $m = p^k$. Thus $A \equiv aA_1 \cdots A_r$ (modulo $m$), where $a = \mathrm{ldcf}(A)$
and the $A_i$ are monic integral polynomials, irreducible modulo $m$. If $B$ is an irreducible
factor of $A$ over the integers, then $B$ is similar to a polynomial $\bar{B}$ with $\mathrm{ldcf}(\bar{B}) = a$.
$\bar{B}$ must then be the product, modulo $m$, of $a$ and some subset of $S = \{A_1,\ldots,A_r\}$,
and $B = \mathrm{pp}(\bar{B})$. By considering smallest subsets of $S$ first, the irreducible factors
of $A$ over $I$ can be obtained after at most $2^{r-1}$ trial divisions.
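
One step of Hensel's construction, from modulus $p^j$ to $p^{j+1}$, can be sketched for a monic polynomial with two relatively prime monic factors modulo p. This is an illustrative sketch only: the function names are hypothetical, polynomials are coefficient lists with the lowest degree first, and the restriction to monic A and to two factors is an assumption made to keep the example short.

```python
def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a


def mul(a, b, m):
    """Product of a and b with coefficients reduced modulo m."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % m
    return trim(out)


def add_scaled(a, b, c, m):
    """Return a + c*b with coefficients reduced modulo m."""
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x + c * y) % m for x, y in zip(a, b)])


def divmod_p(a, b, p):
    """Quotient and remainder in GF(p)[x]."""
    a, q = a[:], [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        s, t = len(a) - len(b), a[-1] * inv % p
        q[s] = t
        for i, c in enumerate(b):
            a[i + s] = (a[i + s] - t * c) % p
        trim(a)
    return trim(q), a


def ext_gcd(g, h, p):
    """S, T with S*g + T*h = 1 in GF(p)[x], assuming gcd(g, h) = 1."""
    r0, r1, s0, s1, t0, t1 = g, h, [1], [], [], [1]
    while r1:
        q, r = divmod_p(r0, r1, p)
        r0, r1 = r1, r
        s0, s1 = s1, add_scaled(s0, mul(q, s1, p), -1, p)
        t0, t1 = t1, add_scaled(t0, mul(q, t1, p), -1, p)
    inv = pow(r0[0], -1, p)              # r0 is a non-zero constant
    return [c * inv % p for c in s0], [c * inv % p for c in t0]


def hensel_lift(A, G, H, p, k):
    """Lift A = G*H (mod p), with G, H monic and coprime mod p, to p**k."""
    S, T = ext_gcd(G, H, p)
    m = p
    for _ in range(k - 1):
        diff = add_scaled(A, mul(G, H, m * p), -1, m * p)
        E = trim([c // m for c in diff])     # (A - G*H)/m, reduced mod p
        q, g = divmod_p(mul(T, E, p), G, p)  # g*H + h*G = E (mod p)
        h = add_scaled(mul(S, E, p), mul(q, H, p), 1, p)
        G = add_scaled(G, g, m, m * p)
        H = add_scaled(H, h, m, m * p)
        m *= p
    return G, H
```

Starting from $x^2 - 2 \equiv (x + 4)(x + 3)$ modulo 7, the call `hensel_lift([-2, 0, 1], [4, 1], [3, 1], 7, 3)` gives $x^2 - 2 \equiv (x + 235)(x + 108)$ modulo 343. Since $x^2 - 2$ is irreducible over the integers, no subset of these factors survives the final trial divisions.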


Musser [50], has shown that the time for application of Hensel's algorithm
is dominated by $n^2L(m)^3 + nL(m)^2L(c)$ where $n = \deg(A)$ and $c = |A|_1$. In fact,
there is a "quadratic" version of Hensel's algorithm which proceeds from a factorization
modulo $p^j$ to one modulo $p^{2j}$, and Musser has shown that the time
for this version is dominated by $n^2L(m)^2 + nL(m)L(c)$. Since $L(m) < nL(c)$ for one
of the bounds considered in Section 3, the time to obtain the factorization modulo $m$
can be dominated by $n^4L(c)^2$.

If $A$ is irreducible, but $\phi_p(A)$ splits into linear factors in $GF(p)$, then in fact
$2^{n-1} - 1$ subsets of $S$ will have to be processed and the computing time of the algorithm
will be exponential in $n$. It would therefore seem advisable to factor $\phi_p(A)$
for several small primes for which $\phi_p(A)$ is squarefree and then apply Hensel's
lemma for a $p$ which produces the smallest number of irreducible factors. But however
many primes are used, the maximum computing time of the resulting algorithm
will still be exponential in $n$, for H. P. F. Swinnerton-Dyer has shown (see [4])
that there are irreducible integral polynomials of degree $n$ which have at least $n/2$
irreducible factors modulo $p$ for every prime $p$. By considering the norms of these
Swinnerton-Dyer polynomials Musser (unpublished paper) has shown that the
maximum computing time of any "Berlekamp-Hensel" algorithm is not dominated




by any polynomial function of $n = \deg(A)$ and $c = |A|_1$. Musser also reestablishes
this by consideration of cyclotomic polynomials.

Nevertheless, the Berlekamp-Hensel algorithm in which one uses the first $p$ for
which $\phi_p(A)$ is squarefree may have an average
computing time which is dominated by a polynomial function of $n$ and $L(c)$. For,
if $A_{n,p}$ is the average number of irreducible monic factors of $\phi_p(A)$ for a random
polynomial $A$ of degree $n$, then $\lim_{p \to \infty} A_{n,p} = H_n = \sum_{i=1}^{n} 1/i$ (see [40], Section
4.6.2, or [3], Chapter 3), and $H_n < \ln(n) + 1$. So the number of modulo $m$ factors
which must be considered for an "average" polynomial of degree $n$ does not exceed
$2^{\ln n} = n^{\ln 2} \approx n^{0.693} < n$.

A version of the Berlekamp-Hensel algorithm with some additional improvements
has been detailed, implemented as part of the SAC-1 system, and subjected to
experimentation by Musser [50] and [26]. Musser's algorithm contains a parameter $v$,
and for each of $v$ primes $p$ for which $\phi_p(A)$ is squarefree, it computes the degree set
of $\phi_p(A)$, which is the set consisting of the degrees of all factors of $\phi_p(A)$. The degree
set of $A$ must be contained in the intersection of the degree sets of the $\phi_p(A)$, and
this is used to reduce the number of modulo $m$ factors which must be tried. From
among these $v$ primes, Hensel's algorithm is applied to one for which the number $r$
of irreducible factors is least (unless $r = 1$).

The degree set of $\phi_p(A)$ is obtained not by use of Berlekamp's algorithm, but
by use of a distinct degree factorization algorithm described in [40]. Given a monic
squarefree polynomial $B$ over $GF(q)$, this algorithm computes polynomials $B_1,\ldots,B_n$
such that $B = \prod_{i=1}^{n} B_i$, and $B_i$ is the product of all monic irreducible divisors
of $B$ of degree $i$. This algorithm has a computing time dominated by $n^3L(q)^2 + n^2L(q)^3$;
this is the same time as the first phase of Berlekamp's algorithm, which only determines
the number of irreducible divisors, while the degree set is determined by the
distinct degree factorization.
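
A distinct degree factorization over a prime field can be sketched as follows. The sketch uses the standard fact that $x^{p^i} - x$ is the product of all monic irreducible polynomials over $GF(p)$ whose degrees divide $i$; whether this matches the algorithm of [40] in every detail is an assumption. The function names are hypothetical and polynomials are coefficient lists with the lowest degree first.

```python
def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a


def divmod_p(a, b, p):
    """Quotient and remainder in GF(p)[x]."""
    a, q = a[:], [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        s, t = len(a) - len(b), a[-1] * inv % p
        q[s] = t
        for i, c in enumerate(b):
            a[i + s] = (a[i + s] - t * c) % p
        trim(a)
    return trim(q), a


def gcd_p(a, b, p):
    """Monic g.c.d. in GF(p)[x]."""
    while b:
        a, b = b, divmod_p(a, b, p)[1]
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]


def mulmod(a, b, f, p):
    """a*b reduced modulo the polynomial f and the prime p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return divmod_p(trim(out), f, p)[1]


def powmod(a, e, f, p):
    """a**e modulo f and p by the binary method."""
    result = [1]
    while e:
        if e & 1:
            result = mulmod(result, a, f, p)
        a = mulmod(a, a, f, p)
        e >>= 1
    return result


def distinct_degree(B, p):
    """Map i -> B_i, the product of the irreducible factors of degree i."""
    out, w, i = {}, [0, 1], 0            # w holds x**(p**i) modulo B
    while len(B) - 1 >= 2 * (i + 1):
        i += 1
        w = powmod(w, p, B, p)
        d = w[:] + [0] * max(0, 2 - len(w))
        d[1] = (d[1] - 1) % p            # d = x**(p**i) - x
        g = gcd_p(trim(d), B, p)
        if len(g) > 1:
            out[i] = g
            B = divmod_p(B, g, p)[0]
            w = divmod_p(w, B, p)[1]
    if len(B) > 1:
        out[len(B) - 1] = B              # what remains is irreducible
    return out
```

Over $GF(3)$, $x^4 + 2 = (x + 1)(x + 2)(x^2 + 1)$, and `distinct_degree([2, 0, 0, 0, 1], 3)` returns `{1: [2, 0, 1], 2: [1, 0, 1]}`, so the degree set is read off without separating the two linear factors.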

Musser’s algorithm further reduces the number of modulo m factors which 
must be computed, by using a trailing coefficient test. For each selected subset of 
the set S of irreducible modulo m factors, the trailing coefficient of the product is 
computed as the product of the trailing coefficients. Only if this product divides 
the trailing coefficient of A does the algorithm compute the polynomial product 
and perform a polynomial trial division. 

Musser applied his algorithm, implemented on a UNIVAC 1108 computer, and 
with v = 5, to 38 randomly generated polynomials with degrees ranging from 10 
to 20 and with coefficient sizes ranging from 2’ to 27'. Some were irreducible and 
others were reducible, having been generated as products of polynomials of degrees 
2, 3 and 5, or 3, 5 and 7. The computing times for these examples ranged from 
0.38 to 27.36 seconds. The irreducible polynomials were often quickly detected 
using the degree sets; but in several cases Hensel’s algorithm had to be applied and 
in these cases the computing times were much the same as for the reducible poly- 
nomials. It is also interesting to observe that in none of the cases did the time to proceed 




from the modulo m factorization to the factorization over I take more than 7% of 
the total time even though this is the part of the algorithm which is least satisfactory 
from a theoretical viewpoint. 

In 1969, Berlekamp devised a new algorithm for the factorization of squarefree 
univariate polynomials over GF(q), [2]. Berlekamp’s new algorithm has four parts. 
In part 1, distinct degree factorization is applied. In part 2, the factorization of a 
squarefree polynomial over GF(p) whose irreducible factors are all of the same 
degree is reduced to the factorization of some squarefree polynomials over GF(q) 
whose factors are all linear. In part 3 the factorization of a squarefree product of 
linear factors over $GF(p^h)$ is reduced to the same problem over $GF(p)$. Finally,
part 4 is an algorithm for factoring a product of distinct linear factors over GF(p). 
The computing time for each of the first 3 parts of Berlekamp’s new algorithm is 
dominated by a polynomial function of n = deg(A) and L(q). The algorithm of part 
4 has an average computing time dominated by n>L(p)° but its maximum computing 
time may be codominant with n*pL(p)*. (A slightly different version of the al- 
gorithm has a maximum computing time dominated by n*p'/*L(p)?!? , but its 
average computing time is not known to be dominated by a polynomial function 
of n and L(p).) 

We have discussed Hensel's algorithm above as a means of obtaining a factorization
of $A$ modulo $p^k$ from a factorization of $\phi_p(A)$ over $GF(p)$, where $A$ is a
polynomial over $I$, $p$ is a prime integer, and $\phi_p(A)$ is squarefree. More generally,
Hensel's algorithm is applicable whenever $A$ is a polynomial over a unique factorization
domain $\mathcal{U}$, $p$ is an irreducible element of $\mathcal{U}$, $\phi_p$ is the natural homomorphism
of $\mathcal{U}$ onto $\mathcal{U}/(p)$, $\mathcal{U}/(p)$ is a field, and $\phi_p(A)$ is squarefree over $\mathcal{U}/(p)$. If we
set $\mathcal{U} = GF(q)(x_1,\ldots,x_{r-1})[x_r]$ and choose $p$ as an irreducible polynomial of
degree $n$ in $x_r$ over $GF(q)$, then $\mathcal{U}/(p)$ is $GF(q^n)(x_1,\ldots,x_{r-1})$ and we obtain thereby
a Hensel-Berlekamp algorithm for multivariate polynomials over any finite field.
If we set $\mathcal{U} = Q(x_1,\ldots,x_{r-1})[x_r]$, $r \geq 1$, and choose $p$ as a polynomial $x_r - a$,
$a \in I$, then $\mathcal{U}/(p) = Q(x_1,\ldots,x_{r-1})$ and we obtain a Hensel-Berlekamp algorithm
for multivariate polynomials over $I$. However, this approach involves rational
function coefficients, and Musser [50], has devised a still more general Hensel
algorithm which avoids this difficulty. As yet, none of these multivariate factorization
algorithms have been implemented and tried, but the supporting theory and
the experience to date with univariate factorization indicate their feasibility.


6. Other operations. B. F. Caviness is developing a SAC-1 system for operations in
$G[x_1,\ldots,x_r]$, where $G$ is the ring of Gaussian integers. Some interesting problems
arise in this endeavour when one attempts to obtain optimal algorithms. For example,
the classical algorithm for division of rational integers has a computing time codominant
with $L(b)L(a/b)$ for division of $a$ by $b$. However, the obvious algorithm
for division of Gaussian integers has a computing time codominant with $L(b)L(a)$,
where $L(c_0 + ic_1) = L(c_0^2 + c_1^2) \sim L(c_0^2) + L(c_1^2) \sim L(c_0) + L(c_1)$. Does there exist an




algorithm for Gaussian integer division, using classical algorithms for rational 
integer arithmetic, whose computing time is dominated by L(b)L(a/b)? Caviness 
has adapted Lehmer’s rational integer g.c.d. algorithm to Gaussian integer g.c.d. 
calculation [9], and is developing a modular algorithm for g.c.d. calculation in
$G[x_1,\ldots,x_r]$. There is a potential application of this work in performing symbolic
calculations with elementary transcendental functions [8], and in computing the 
complex zeros of a polynomial. 
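
The text does not spell out the obvious algorithm for Gaussian integer division; presumably it is of the following kind, in which $a$ is multiplied by the conjugate of $b$ and the two components are divided by the norm of $b$. The sketch below rests on that assumption, represents a Gaussian integer $c_0 + ic_1$ as the pair `(c0, c1)`, and uses a hypothetical function name. The products $a\bar{b}$ have length about $L(a) + L(b)$ regardless of the size of the quotient, which is consistent with the cost $L(b)L(a)$ quoted above.

```python
def gauss_divmod(a, b):
    """Quotient and remainder of Gaussian integers, quotient rounded to nearest."""
    (a0, a1), (b0, b1) = a, b
    n = b0 * b0 + b1 * b1                 # norm of b
    x = a0 * b0 + a1 * b1                 # real part of a * conj(b)
    y = a1 * b0 - a0 * b1                 # imaginary part of a * conj(b)
    q0 = (2 * x + n) // (2 * n)           # nearest integer to x / n
    q1 = (2 * y + n) // (2 * n)           # nearest integer to y / n
    r = (a0 - (q0 * b0 - q1 * b1), a1 - (q0 * b1 + q1 * b0))
    return (q0, q1), r
```

For example `gauss_divmod((27, 23), (8, 1))` returns `((4, 2), (-3, 3))`: indeed $(4 + 2i)(8 + i) = 30 + 20i$, and the remainder $-3 + 3i$ has norm 18, at most half the norm 65 of the divisor.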

Algorithms for arithmetic operations on rational functions are quite simply
obtained in terms of arithmetic operations on polynomials and polynomial g.c.d.
calculation. If $\mathfrak{A}$ is any g.c.d. domain with multiplicative ample set $\mathcal{A}$ and $\mathfrak{F}$ is the
fraction field of $\mathfrak{A}$, the elements of $\mathfrak{F}$ can be uniquely represented as pairs $(a_1, a_2)$
such that $a_2 \neq 0$, $\gcd(a_1,a_2) = 1$, and $a_2 \in \mathcal{A}$. If $\mathfrak{A} = \mathfrak{A}_0[x_1,\ldots,x_r]$, then $\mathcal{A}$
may be defined from an ample set $\mathcal{A}_0$ for $\mathfrak{A}_0$ as discussed earlier. If $A(x_1,\ldots,x_r)
\in \mathfrak{A}_0[x_1,\ldots,x_r]$ is regarded as an element of $\mathfrak{A}_0[x_1,\ldots,x_{r-1}][x_r]$, the leading
$\mathfrak{A}_0$-coefficient of $A$ is defined, recursively, as the leading $\mathfrak{A}_0$-coefficient of the leading
coefficient of $A$ if $r > 1$, and as the leading coefficient of $A$ if $r = 1$. Then $A \in \mathcal{A}$
just in case its leading $\mathfrak{A}_0$-coefficient is in $\mathcal{A}_0$. If $\mathfrak{F}_0$ is the fraction field of $\mathfrak{A}_0$ then
$\mathfrak{F}_0(x_1,\ldots,x_r)$, the fraction field of $\mathfrak{F}_0[x_1,\ldots,x_r]$, is isomorphic with $\mathfrak{A}_0(x_1,\ldots,x_r)$,
the fraction field of $\mathfrak{A}_0[x_1,\ldots,x_r]$, and it is generally more efficient computationally
to use the latter. This is the approach which has been used, for example, with $\mathfrak{A}_0 = I$
in the SAC-1 Rational Function System [19], which also provides rational number
arithmetic as the special case $r = 0$.

The obvious algorithms for addition and multiplication in a fraction field are
susceptible of some improvements, as was observed by P. Henrici in 1956, [34],
for rational numbers. If $a_1, a_2 \in \mathfrak{A}$ and $a_2 \neq 0$, let us write $a_1/a_2$ for the unique
pair $(\bar{a}_1, \bar{a}_2)$ such that $\bar{a}_2 \neq 0$, $\bar{a}_2 \in \mathcal{A}$ and $\gcd(\bar{a}_1, \bar{a}_2) = 1$. The obvious algorithm
for multiplication in $\mathfrak{F}$ applies the formula $(a_1,a_2) \cdot (b_1,b_2) = a_1b_1/a_2b_2$.
Henrici's algorithm instead sets $(\bar{a}_1, \bar{b}_2) = a_1/b_2$, $(\bar{a}_2, \bar{b}_1) = a_2/b_1$, and then
$(a_1,a_2) \cdot (b_1,b_2) = (\bar{a}_1\bar{b}_1, \bar{a}_2\bar{b}_2)$. The obvious algorithm performs two multiplications
and one reduction (the "/" operation); Henrici's performs two multiplications
and two reductions. To see how the Henrici algorithm can nevertheless be
faster, consider the very special case in which $a_1$, $a_2$, $b_1$ and $b_2$ are pairwise relatively
prime integers, all of the same length $d$. If $d$ is very large, most of the time for either
algorithm will be used by the reductions. For some constant $c$, the time for each
of the two reductions in the Henrici algorithm will be approximately $cd^2$, while
the time for the one reduction in the obvious algorithm will be approximately $c(2d)^2$.
Hence Henrici's algorithm will be about twice as fast in this case. When $\mathfrak{A}$ is a polynomial
domain in several variables, the Henrici algorithm will generally be faster
by a much larger factor. There is also a Henrici algorithm for addition, and a
Henrici-type algorithm for differentiation of rational functions.
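
The two multiplication algorithms can be compared directly for rational numbers. In the sketch below, a fraction is a pair `(numerator, denominator)` in lowest terms with positive denominator, `reduce_pair` plays the role of the "/" operation, and all function names are hypothetical. The second cross reduction is written as $b_1/a_2$ rather than $a_2/b_1$, an equivalent orientation chosen here so that a zero numerator needs no special case.

```python
from math import gcd


def reduce_pair(n, d):
    """The '/' operation: the canonical pair for the fraction n/d, d != 0."""
    g = gcd(n, d)
    n, d = n // g, d // g
    return (-n, -d) if d < 0 else (n, d)


def multiply_obvious(a, b):
    """Two multiplications, then one reduction of the large products."""
    return reduce_pair(a[0] * b[0], a[1] * b[1])


def multiply_henrici(a, b):
    """Two reductions of the small cross pairs, then two multiplications."""
    a1, b2 = reduce_pair(a[0], b[1])
    b1, a2 = reduce_pair(b[0], a[1])
    return a1 * b1, a2 * b2
```

With `a = (6, 35)` and `b = (14, 9)` both functions return `(4, 15)`; Henrici's version computes gcd(6, 9) and gcd(14, 35), whereas the obvious version computes gcd(84, 315) on operands of twice the length.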

The integral of a rational function is not in general a rational function. That is, 
not every rational function is the derivative of some rational function. However, 




if $\mathfrak{F}$ is any field and $(A,B) \in \mathfrak{F}(x)$, then there exist unique polynomials
$C, D, E \in \mathfrak{F}[x]$ such that $A/B = C' + (D/\hat{B})' + E/\bar{B}$, with $C(0) = 0$, $D = 0$ or
$\deg(D) < \deg(\hat{B})$, and $E = 0$ or $\deg(E) < \deg(\bar{B})$, where $\hat{B} = \gcd(B,B')$ and
$\bar{B} = B/\hat{B} = \mathrm{gsfd}(B)$. It follows that $A/B$ is "integrable" just in case $E = 0$. In any
case $C$ is called the polynomial part of the integral of $A/B$ and $D/\hat{B}$ is called the
rational part. In the case $\mathfrak{F} = Q$, $\int (E/\bar{B})$ is a sum of logarithms $\sum_{i=1}^{n} \alpha_i \log(x - \beta_i)$,
called the transcendental part. We shall refer to $E/\bar{B}$ as the remainder of the integral.
There is a classical algorithm due to Hermite for computing the polynomial and
rational parts and the remainder. The first phase of Hermite's algorithm performs
a squarefree partial fraction decomposition, expressing $A/B$ in the form


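(32) $A/B = F + \sum_{j=1}^{k} \sum_{i=1}^{j} G_{j,i}/B_j^{\,i}$,


where $\prod_{j=1}^{k} B_j^{\,j}$ is the squarefree factorization of $B$, $F$ is a polynomial, and the $G_{j,i}$
are polynomials with $\deg(G_{j,i}) < \deg(B_j)$ or $G_{j,i} = 0$. In the second phase, each sum
$\sum_{i=1}^{j} G_{j,i}/B_j^{\,i}$ is integrated by parts, yielding


(33) $\sum_{i=1}^{j} G_{j,i}/B_j^{\,i} = \Bigl(\sum_{i=1}^{j-1} H_{j,i}/B_j^{\,i}\Bigr)' + H_j/B_j$.

Then

(34) $D/\hat{B} = \sum_{j=2}^{k} \sum_{i=1}^{j-1} H_{j,i}/B_j^{\,i}$

and

(35) $E/\bar{B} = \sum_{j=1}^{k} H_j/B_j$.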
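
As a small worked instance of this decomposition (an example chosen here for illustration, not taken from the source), take $A/B = 1/(x^2(x+1))$ over $Q$, whose squarefree factorization has $B_1 = x + 1$ and $B_2 = x$:

```latex
\[
\hat{B} = \gcd(B, B') = x, \qquad
\bar{B} = B/\hat{B} = x(x+1),
\]
\[
\frac{1}{x^{2}(x+1)}
  = \frac{1}{x^{2}} - \frac{1}{x} + \frac{1}{x+1}
  = \Bigl(\frac{-1}{x}\Bigr)' + \frac{-1}{x(x+1)},
\]
\[
C = 0, \qquad D = -1, \qquad E = -1 .
\]
```

Here $E \neq 0$, so the function is not integrable in the above sense; its integral is $-1/x - \log x + \log(x+1)$, with rational part $D/\hat{B} = -1/x$ and a transcendental part arising from the remainder $E/\bar{B}$.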


For the case $\mathfrak{F} = Q$, Ellis Horowitz ([36], [37]) developed a modular version of
Hermite's algorithm, which was found to be much faster than a version using arithmetic
in $Q$. Horowitz went on to show how the polynomials $C$, $D$ and $E$ could be
computed directly by solving a system of linear equations, avoiding both partial
fraction decomposition and integration by parts, and obtaining thereby a still faster
algorithm for rational function integration. Both the modular version of Hermite's
algorithm and the new Horowitz algorithm are implemented for $\mathfrak{F} = Q$ in the SAC-1
system [23].

Michael T. McClellan has recently devised modular algorithms for various
operations of linear algebra on matrices over $I[x_1,\ldots,x_r]$, $r \geq 0$. The modular
algorithms include matrix multiplication, determinant calculation, matrix inversion, 
and solution of a general matrix equation AX = B. Of course, classical methods 
are available, which when implemented using modular algorithms for polynomial 
multiplication, division, and g.c.d. calculation, will produce quite efficient algorithms. 
However, McClellan has produced still more efficient algorithms by interjecting the 
modular methods at the matrix level. For matrix multiplication and determinant 
calculation, modular and evaluation homomorphisms may be applied in a straightforward
manner, reducing the problem to matrices over GF(p). A modular algorithm
for matrix inversion is equally trivial if one uses the formula $A^{-1} = \mathrm{adj}(A)/\det(A)$,
computing simultaneously the adjoint and determinant of A and then reducing each
element $\mathrm{adj}(A)_{i,j}/\det(A)$ to lowest terms with a modular polynomial g.c.d. algorithm.
A modular algorithm for solution of the general matrix equation where the ranks 
of A and B are unknown is much more difficult. However, McClellan has devised 
an ingenious method for specifying a particular solution from among all solutions 
of AX = B and for rejecting all homomorphisms which fail to contribute to the 
determination of that particular solution. McClellan’s algorithm also detects cases 
in which AX = B is inconsistent and in cases where there are multiple solutions a 
basis for the null space of A is computed, from which all solutions are easily ob- 
tained. McClellan’s work is described in [47], [48] and [49]; his algorithms have 
been incorporated in the SAC-1 system [24]. 
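
The idea of applying homomorphisms at the matrix level can be sketched for the simplest case, the determinant of a matrix of integers (r = 0). The following Python sketch is only an illustration under stated assumptions: it is not McClellan's algorithm, the function names `det_mod_p` and `modular_det` are hypothetical, the caller supplies a list of distinct primes, and Hadamard's inequality is used as the coefficient bound.

```python
from math import isqrt

def det_mod_p(rows, p):
    """Determinant of an integer matrix modulo a prime p (elimination over GF(p))."""
    a = [[x % p for x in row] for row in rows]
    n = len(a)
    det = 1
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign of the determinant
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            if factor:
                a[r] = [(x - factor * y) % p for x, y in zip(a[r], a[col])]
    return det % p

def modular_det(rows, primes):
    """Integer determinant recovered from its images modulo several primes."""
    # Hadamard bound: |det| <= product of the Euclidean norms of the rows.
    squared = 1
    for row in rows:
        squared *= sum(x * x for x in row)
    bound = isqrt(squared) + 1
    modulus, residue = 1, 0
    for p in primes:
        r = det_mod_p(rows, p)
        # Chinese remainder step: extend residue (mod modulus) to modulus * p.
        t = (r - residue) * pow(modulus, -1, p) % p
        residue += modulus * t
        modulus *= p
        if modulus > 2 * bound:
            break
    else:
        raise ValueError("not enough primes for the Hadamard bound")
    # Map to the symmetric range so that negative determinants are recovered.
    return residue - modulus if residue > modulus // 2 else residue
```

For matrices over $I[x_1, \ldots, x_r]$ the same pattern would be repeated with evaluation homomorphisms and interpolation in place of the Chinese remainder step; that extension is not shown here.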

Approximation of the real zeros of a univariate polynomial is ordinarily regarded
as a numerical problem rather than as an algebraic problem, but Lee E. Heindel ([32],
[33]) has demonstrated the efficacy of an algebraic approach which uses infinite
precision arithmetic. Heindel has developed an algorithm which, given as inputs
any non-zero univariate integral polynomial A and any positive rational number $\epsilon$,
produces as output a sequence $I_1, \ldots, I_r$ of disjoint intervals with rational endpoints,
each of length less than $\epsilon$, such that if $\alpha_1 < \alpha_2 < \cdots < \alpha_r$ are the distinct real zeros
(either simple or multiple) of A, then $\alpha_i \in I_i$. Heindel's algorithm uses Sturm's
theorem, which, for any squarefree integral polynomial B, enables us to determine
the number of real zeros of B in any left-open, right-closed interval (a, b]. Thus
Heindel's algorithm begins by setting B = gsfd(A), which has the same zeros as A
and is squarefree. A negative p.r.s. (over any ordered integral domain) is a p.r.s.
$B_1, B_2, \ldots, B_s, B_{s+1} = 0$ satisfying, for some $c_i$, $d_i$ and $Q_i$,

$$c_i B_i = Q_i B_{i+1} + d_i B_{i+2}, \qquad c_i d_i < 0 \tag{36}$$

for $1 \le i < s$. A Sturm sequence for B is a negative p.r.s. for which $B_1 = B$ and
$B_2 = B'$. If we denote by V(a) the number of variations of sign in the sequence
$B_1(a), B_2(a), \ldots, B_s(a)$, then by Sturm's theorem the number of zeros of B in
(a, b] is V(a) - V(b). Heindel's algorithm computes a primitive Sturm sequence,
in which each $B_i$ is a primitive integral polynomial. Beginning with a single interval
$(-U, +U]$, Heindel's algorithm bisects intervals containing more than one zero
and discards intervals containing no zeros, finally arriving at a sequence of isolating
intervals, each containing exactly one zero of B. Some of the isolating intervals
may be longer than $\epsilon$, but Sturm's theorem is not needed to refine these since if (a, b]
contains at most one zero, then it contains one just in case B(b) = 0 or B(a)B(b) < 0.
The computing time of Heindel’s algorithm is dominated by 


m>L(d)* + rm*L(dl)? + rm°L(dl)> + rm?L(de), 
where m = deg(A), r is the number of real zeros of A, $d = \max\{|A|_1, |\mathrm{gsfd}(A)|_1\}$,
$l = \lceil 1/\Delta \rceil$, $e = \lceil 1/\epsilon \rceil$, and $\Delta = \min_{1 \le i < r}(\alpha_{i+1} - \alpha_i)$ if $r \ge 2$, $\Delta = 1$ otherwise. By
considering the discriminant of B = gsfd(A), it can be shown that, for $r \ge 2$,


(37) Az a™(2QuUy m4 
where U is an upper bound on the zeros of A. Since d is an upper bound, we have 
(38) A= (2d) "4. 


It would be interesting to know whether there is a much sharper lower bound for $\Delta$
as a function of m and d, as one would expect. 
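
The isolation phase described above (Sturm sequence, sign variations, bisection) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the SAC-1 implementation: the input polynomial is assumed to be non-constant and already squarefree, coefficients are listed from the highest degree down, exact rational arithmetic (`Fraction`) replaces the primitive integral Sturm sequence, and all function names are our own.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Horner evaluation; coefficients are listed highest degree first."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def poly_rem(a, b):
    """Remainder of a divided by b over the rationals."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)  # the leading coefficient is now zero
    while a and a[0] == 0:
        a.pop(0)
    return a

def sturm_sequence(b):
    """B_1 = B, B_2 = B', B_{i+2} = -rem(B_i, B_{i+1}); B must be squarefree."""
    n = len(b) - 1
    deriv = [Fraction(c) * (n - i) for i, c in enumerate(b[:-1])]
    seq = [[Fraction(c) for c in b], deriv]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])

def variations(seq, x):
    """Number of sign variations V(x), ignoring zero values."""
    vals = [v for v in (poly_eval(p, x) for p in seq) if v != 0]
    return sum(1 for s, t in zip(vals, vals[1:]) if (s > 0) != (t > 0))

def count_zeros(seq, a, b):
    """Number of real zeros in the left-open, right-closed interval (a, b]."""
    return variations(seq, a) - variations(seq, b)

def isolate(b_poly, lo, hi):
    """Isolating intervals (a, b] for the zeros of b_poly lying in (lo, hi]."""
    seq = sturm_sequence(b_poly)
    stack, found = [(Fraction(lo), Fraction(hi))], []
    while stack:
        a, b = stack.pop()
        k = count_zeros(seq, a, b)
        if k == 1:
            found.append((a, b))
        elif k > 1:  # bisect; intervals with no zeros are discarded
            mid = (a + b) / 2
            stack += [(a, mid), (mid, b)]
    return sorted(found)
```

For example, `isolate([1, 0, -2, 0], -2, 2)` treats B = x^3 - 2x and separates its three real zeros; refining each interval to a prescribed length would then proceed by the sign test on the endpoints mentioned in the text.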

Heindel's algorithm has been implemented in the SAC-1 system [21]. Actually,
it appears to be much faster than the theoretical analysis would suggest. For example,
when applied to the Chebychev polynomials of various degrees, with $\epsilon = 10^{-10}$,
the observed computing times in seconds on a UNIVAC 1108 computer were quite
nearly proportional to the square of the degree, as follows:

degree |  10 |  15 |  20 |  25
time   |  20 |  46 | 103 | 171

An approach similar to Heindel's can be used to approximate the zeros, real
and complex, of any univariate Gaussian polynomial A, i.e., any univariate polynomial
with Gaussian integer coefficients. As before, we begin by computing
B = gsfd(A). Let $B = B_0 + iB_1$ where $B_0$ and $B_1$ have rational integer coefficients,
let $\hat{B} = \gcd(B_0, B_1)$, $C_0 = B_0/\hat{B}$, $C_1 = B_1/\hat{B}$ and $C = C_0 + iC_1$. Then $B = \hat{B}C$
and every real zero of B is a zero of $\hat{B}$, which can be computed by Heindel's algorithm.
C has no real zeros and so by applying the Routh-Hurwitz theorem to C
we can compute the number of zeros of C above the real axis. Since the zeros of $\hat{B}$
occur in complex conjugate pairs, we can also determine the number of zeros of $\hat{B}$
in the upper half-plane using Sturm's theorem, and hence the number of zeros of B
in the upper half-plane.

A Routh-Hurwitz sequence for C is any negative p.r.s. $D_1, D_2, \ldots, D_s, D_{s+1} = 0$
in which $D_1 = C_0$ and $D_2 = C_1$. If V(a) is the number of variations of sign in the
sequence $D_1(a), D_2(a), \ldots, D_s(a)$ then, according to the Routh-Hurwitz theorem,
the number of zeros of C in the upper half-plane is

$$\tfrac{1}{2}\{\deg(C) + V(\infty) - V(-\infty)\}. \tag{39}$$


In a forthcoming Ph.D. thesis [53], James R. Pinkert will show how the Routh- 
Hurwitz theorem can be combined with rotations, translations and other devices 
to determine the number of zeros of a Gaussian polynomial in any rectangle with 
sides parallel to the axes, and from this will specify a complete algorithm for ap- 
proximating the zeros to desired accuracy. 

In this survey we have concerned ourselves primarily with operations on poly- 
nomials with integer, Gaussian integer, or rational number coefficients although, 
as we have seen, this leads naturally to coefficients from a finite field. Although many
challenging problems remain regarding the most efficient algorithms for such polynomials,
the accomplishments of the last decade have been sufficient to justify
attention in the years ahead to algorithms for operations on algebraic numbers 
and on polynomials with algebraic number coefficients. 

An obvious first step is to consider arithmetic in an algebraic number field $Q(\alpha)$.
If we are given the minimal polynomial of $\alpha$, that is, the unique monic irreducible
polynomial A with coefficients in Q satisfying $A(\alpha) = 0$, then this problem is theoretically
trivial since $Q(\alpha)$ is isomorphic to the residue class ring Q[x]/(A(x)), and
each residue class may be represented by its unique element of degree less than
n = deg(A). However, any non-zero polynomial B over Q can be expressed uniquely
in the form

$$B = b \cdot \bar{B}, \tag{40}$$

where b is a rational number and $\bar{B}$ is a primitive integral polynomial with positive
leading coefficient, and one may ask whether this representation leads to more
efficient algorithms for arithmetic in $Q(\alpha)$.
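
The "theoretically trivial" residue class arithmetic can be sketched as follows. This is a minimal Python sketch using the plain rational representation, not the primitive-polynomial representation (40); the function name `mul_in_field` and the convention of listing coefficients from the highest degree down are our assumptions.

```python
from fractions import Fraction

def mul_in_field(u, v, a):
    """Product of two elements of Q[x]/(A(x)).

    u, v: coefficient lists (highest degree first) of degree < deg(A).
    a: coefficients of the monic minimal polynomial A, highest degree first.
    Returns the product as a list of exactly deg(A) coefficients.
    """
    # Ordinary polynomial product.
    prod = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, c in enumerate(u):
        for j, d in enumerate(v):
            prod[i + j] += Fraction(c) * d
    # Reduce modulo A; since A is monic no division is needed.
    while len(prod) >= len(a):
        lead = prod[0]
        for i in range(len(a)):
            prod[i] -= lead * a[i]
        prod.pop(0)  # the leading coefficient is now zero
    # Pad with leading zeros to the fixed length deg(A).
    return [Fraction(0)] * (len(a) - 1 - len(prod)) + prod
```

With A = x^2 - 2, so that the generator is a square root of 2, `mul_in_field([1, 1], [1, -1], [1, 0, -2])` multiplies the classes of x + 1 and x - 1 and returns the class of the constant 1.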

If $\alpha$ is a real algebraic number then $Q(\alpha)$ is an ordered field and we may also
require an algorithm for the order relation in $Q(\alpha)$. Equivalently, we seek an algorithm
which decides whether any given element $B(\alpha)$ of $Q(\alpha)$ is positive, negative
or zero. Since deg(B) < n we have $B(\alpha) = 0$ if and only if B = 0. Thus, referring
to (40), the problem is to decide whether $\bar{B}(\alpha)$ is positive or negative. If A has more
than one real zero, the answer may of course depend on which zero of A is denoted
by $\alpha$. One solution is to specify $\alpha$ by an isolating interval I, that is, an interval with
rational endpoints containing $\alpha$ but no other zeros of A. Applying Sturm's theorem
to I we can decide whether I contains any zeros of $\bar{B}$. If so, we can bisect I and obtain
a smaller isolating interval for $\alpha$. Since $\bar{B}(\alpha) \ne 0$ we eventually obtain an isolating
interval (c, d] for $\alpha$ which contains no zeros of $\bar{B}$, and then $\mathrm{sign}(\bar{B}(\alpha)) = \mathrm{sign}(\bar{B}(d))$.
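
The refinement loop just described can be sketched in Python. One substitution should be noted plainly: instead of applying Sturm's theorem to the second polynomial, this sketch bounds its values on the current interval by rational interval arithmetic and bisects until the bound excludes zero. The function names are hypothetical, A is assumed to be the (irreducible) minimal polynomial, and coefficients are listed from the highest degree down.

```python
from fractions import Fraction

def horner(p, x):
    """Exact evaluation of p at x; coefficients highest degree first."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def interval_horner(p, lo, hi):
    """An enclosure [low, high] of the values of p on [lo, hi]."""
    a = b = Fraction(0)
    for c in p:
        prods = (a * lo, a * hi, b * lo, b * hi)
        a, b = min(prods) + c, max(prods) + c
    return a, b

def sign_at_root(a_poly, b_poly, lo, hi):
    """Sign of b_poly at the unique zero of a_poly in (lo, hi].

    Assumes deg(b_poly) < deg(a_poly), so the value is zero only if
    b_poly is the zero polynomial.
    """
    if all(c == 0 for c in b_poly):
        return 0
    lo, hi = Fraction(lo), Fraction(hi)
    while True:
        low, high = interval_horner(b_poly, lo, hi)
        if low > 0:
            return 1
        if high < 0:
            return -1
        mid = (lo + hi) / 2
        # (lo, mid] contains the zero just in case A(mid) = 0 or A(lo)A(mid) < 0.
        at_mid = horner(a_poly, mid)
        if at_mid == 0 or horner(a_poly, lo) * at_mid < 0:
            hi = mid
        else:
            lo = mid
```

For instance, with A = x^2 - 2 and the isolating interval (1, 2], `sign_at_root([1, 0, -2], [1, Fraction(-3, 2)], 1, 2)` decides the sign of the square root of 2 minus 3/2 after a few bisections.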

Sturm's theorem is applicable to polynomials with real algebraic coefficients,
and so there is a potential of extending Heindel's algorithm to polynomials over
$Q(\alpha)$. This of course leads to questions about optimal algorithms for computing
g.c.d.'s of polynomials over $Q(\alpha)$ and for generating Sturm sequences over $Q(\alpha)$.
These and some related problems are currently being investigated by Cyrenus M.
Rubald [55].


Rüdiger Loos [45] has recently extended some of these ideas to obtain interesting
algorithms for arithmetic in the field R of all real algebraic numbers. Any element $\alpha$
of R is represented by a pair (A, I) where A is the minimal polynomial of $\alpha$ and I
is a rational isolating interval for $\alpha$. To add $\alpha$ and $\beta$, represented by (A, I) and (B, J)
for example, we first compute the resultant C, with respect to y, of A(y) and B(x - y),
and the interval $K = I + J = \{a + b : a \in I \ \&\ b \in J\}$. $\gamma = \alpha + \beta$ is a zero of C and
$\gamma \in K$, but C may not be irreducible and K may contain more than one zero of C.
However, I and J can be simultaneously refined until K = I + J is an isolating
interval, C can be completely factored, and then the unique irreducible factor of C
which has opposite signs at the endpoints of K has $\gamma$ as a zero.


Research supported by National Science Foundation Grant GJ—30125X, the Wisconsin Alumni 
Research Foundation and the Stanford Artificial Intelligence Project. 


References 


1. A programming language for information processing on automatic data processing systems,
Comm. A. C. M., No. 10, 7 (Oct. 1964) 591-625.

2. E. R. Berlekamp, Factoring polynomials over finite fields, Bell System Tech. J., 46 (1967)
1853-1859.

3. ———, Algebraic Coding Theory, McGraw-Hill, New York, 1968.

4. ———, Factoring polynomials over large finite fields, Math. Comp., No. 111, 24 (July 1970)
713-735.


5. W. S. Brown, On Euclid’s algorithm and the computation of polynomial greatest common 
divisors, J. Assoc. Comput. Mach., No. 4, 18 (Oct. 1971) 478-504. 

6. ———, ALTRAN User's Manual (2nd ed.), Bell Laboratories, Murray Hill, N. J., 1972.

7. W. S. Brown and J. F. Traub, On Euclid’s algorithm and the theory of subresultants, J. 
Assoc. Comp. Mach., No. 4, 18 (Oct. 1971) 515-532. 

8. B. F. Caviness, On canonical forms and simplification, J. Assoc. Comp. Mach., No. 2, 17 
(April 1970) 385-396. 

9. ———, A Lehmer-type greatest common divisor algorithm for Gaussian integers, Paper
presented at SIAM-SIGNUM 1972 Fall Meeting, Austin, Texas, Oct. 1972. 

10. G. E. Collins, A method for overlapping and erasure of lists, Comm. ACM, No. 12, 3 (Dec. 
1960) 655-657. 


11. ———, PM-A system for polynomial manipulation, Comm. A. C. M., No. 8, 9 (Aug. 1966) 
578-589. 

12. ———, Polynomial remainder sequences and determinants, this MONTHLY, No. 7, 73 (1966)
708-712.

13. ———, Subresultants and reduced polynomial remainder sequences, J. Assoc. Comp. Mach.,
No. 1, 14 (Jan. 1967) 128-142.

14. ———, Computing time analysis for some arithmetic and algebraic algorithms, Proc.
1968 Summer Inst. on Symbolic Math. Comp., pp. 195-231. IBM Federal Systems Center, 1968.

15. ———, The computing time of the Euclidean algorithm, Stanford University Comp. Sci.
Dept. Report No. CS-331, January 1973, 17 pages.

16. ———, The SAC-1 list processing system, Univ. of Wisconsin Comp. Sci. Dept. Tech.
Report No. 129, July 1971, 34 pages.

17. ———, The SAC-1 integer arithmetic system-version III, Univ. of Wisconsin Comp. Sci.
Dept. Tech. Report No. 156, July 1972, 63 pages.

18. ———, The SAC-1 polynomial system, Univ. of Wisconsin Comp. Sci. Dept. Tech. Report
No. 115, March 1971, 66 pages.

19. ———, The SAC-1 rational function system, Univ. of Wisconsin Comp. Sci. Dept. Tech.
Report No. 135, Sept. 1971, 31 pages.

20. ———, The SAC-1 polynomial GCD and resultant system, Univ. of Wisconsin Comp.
Sci. Dept. Tech. Report No. 145, Feb. 1972, 93 pages.

21. G.E. Collins and L. E. Heindel, The SAC-1 polynomial real zero system, Univ. of Wisconsin 
Comp. Sci. Tech. Report No. 93, Aug. 1970, 72 pages. 

22. G. E. Collins, L. E. Heindel, E. Horowitz, M. T. McClellan and D. R. Musser, The SAC-1
modular arithmetic system, Univ. of Wisconsin Comp. Center Tech. Report No. 10, June 1969,
50 pages.

23. G. E. Collins and E. Horowitz, The SAC-1 partial fraction decomposition and rational 
function integration system, Univ. of Wisconsin Comp. Sci. Dept. Tech. Report No. 80, Feb. 1970, 
47 pages. 

24. G.E. Collins and M. T. McClellan, The SAC-1 polynomial linear algebra system, Univ. of 
Wisconsin Comp. Sci. Dept. Tech. Report No. 154, April 1972, 107 pages. 

25. G. E. Collins and D. R. Musser, Analysis of the Pope-Stein division algorithm, Univ. of 
Wisconsin Comp. Sci. Dept. Tech. Report No. 55, June 1969, 10 pages. 

26. G. E. Collins and D. R. Musser, The SAC-1 polynomial factorization system, Univ. of 
Wisconsin Comp. Sci. Dept. Tech. Report No. 157, March 1972, 65 pages. 

27. J. D. Dixon, A simple estimate for the number of steps in the Euclidean algorithm, this 
MONTHLY, 78 (1971) 374-376. 

28. P. Erdős, On the coefficients of the cyclotomic polynomials, Bull. Amer. Math. Soc. No. 2,
52 (February 1946) 179-184. 

29. J. H. Griesmer and R. D. Jenks, SCRATCHPAD/1 — An interactive facility for symbolic 
mathematics, Proc. Second Symp. on Symbolic and Algebraic Manipulation, pp. 42-58, Assoc. Comp. 
Mach., 1971. 

30. A. D. Hall, Jr., The ALTRAN system for rational function manipulation — a survey, Proc. 
Second Symp. on Symbolic and Algebraic Manipulation, pp. 153-157, Assoc. Comp. Mach., 1971. 

31. A. C. Hearn, Reduce 2: A system and language for algebraic manipulation, Proc. Second 
Symp. on Symbolic and Algebraic Manipulation, pp. 128-133, Assoc. Comp. Mach., 1971. 

32. L. E. Heindel, Algorithms for exact polynomial root calculation, Univ. of Wisconsin Ph.D. 
Thesis, 1970, 153 pages. 

33. ———, Integer arithmetic algorithms for polynomial real zero determination, J. Assoc.
Comp. Mach., No. 4, 18 (Oct. 1971) 533-548. 

34. P. Henrici, A subroutine for computations with rational numbers, J. Assoc. Comp. Mach., 
No.1, 3 (1956) 6-9. 

35. K. Hensel, Theorie der algebraischen Zahlen, Chapter 4, Teubner, Leipzig, 1908. 

36. E. Horowitz, Algorithms for symbolic integration of rational functions, Univ. of Wisconsin 
Ph. D. Thesis, 1969, 132 pages. 

37. ———, Algorithms for partial fraction decomposition and rational function integration,
Proc. Second Symp. on Symbolic and Algebraic Manipulation, pp. 441-457, Assoc. Comp. Mach.,
1971.

38. S. C. Johnson, Tricks for improving Kronecker’s polynomial factoring algorithm, Bell 
Labs. Report, Murray Hill, N. J., 1966, 22 pages. 

39. D. E. Knuth, The Art of Computer Programming, Vol. I: Fundamental Algorithms,
Addison-Wesley, Reading, Mass., 1968.

40. ———, The Art of Computer Programming, Vol. II: Seminumerical Algorithms, Addison-
Wesley, Reading, Mass., 1969.

41. ———, Mathematical analysis of algorithms, Stanford Univ. Comp. Sci. Dept. Tech. Report
STAN-CS-71-206, March 1971, 26 pages.

42. L. Kronecker, Grundzüge einer arithmetischen Theorie der algebraischen Grössen, Part I,
Section 4, G. Reimer, Berlin, 1882. 

43. D. H. Lehmer, Euclid’s algorithm for large numbers, this MONTHLY, 45 (1938) 227-233. 

44. J. D. Lipson, Chinese remainder and interpolation algorithms, Proc. Second Symp. on Sym- 
bolic and Algebraic Manipulations, pp. 372-391, Assoc. Comp. Mach., 1971. 

45. R. Loos, A constructive approach to algebraic numbers. 

46. J. McCarthy et al., LISP 1.5 Programmer’s Manual, M.I. T. Press, Cambridge, Mass., 1962. 

47. M. T. McClellan, The exact solution of systems of linear equations with polynomial coefficients,
Univ. of Wisconsin Comp. Sci. Dept. Tech. Report No. 136 (Ph. D. Thesis) Sept. 1971, 258
pages.

48. ———, The exact solution of systems of linear equations with polynomial coefficients, Proc.
Second Symp. on Symbolic and Algebraic Manipulation, pp. 399-414, Assoc. Comp. Mach., 1971.

49. ———, The exact solution of systems of linear equations with polynomial coefficients, to
appear in J. Assoc. Comp. Mach.

50. D. R. Musser, Algorithms for polynomial factorization, Univ. of Wisconsin Comp. Sci. 
Dept. Tech. Report No. 134 (Ph. D. Thesis) Sept. 1971, 174 pages. 

51. A. Newell, J. C. Shaw and H. A. Simon, Empirical explorations of the logic theory machine, 
Proc. 1957 Western Joint Comp. Conf., pp. 218-230. 

52. J. R. Pinkert, SAC-1 Implementation of Toom’s Algorithm for Fast Multiplication of 
Large Integers in Radix Representation, unpublished report, August 1968. 

53. ———, Univ. of Wisconsin Comp. Sci. Dept. Ph. D. Thesis, in preparation.

54. D. A. Pope and M. L. Stein, Multiple precision arithmetic, Comm. A. C. M. No. 12, 3 
(Dec. 1960) 652-654.

55. C. M. Rubald, Univ. of Wisconsin Comp. Sci. Dept. Ph. D. Thesis, in preparation. 

56. J. Xenakis, The PL/1-FORMAC Interpreter, Proc. Second Symp. on Symbolic and Algebraic 
Manipulation, pp. 105-114, Assoc. Comp. Mach., 1971. 

57. H. Zassenhaus, On Hensel factorization, I, J. Number Theory, No. 1, 1 (July 1969) 291-311. 


ON THE DISCRETE VERSION OF WIRTINGER’S INEQUALITY 


O. SHISHA, Aerospace Research Laboratories, Wright-Patterson AFB, Ohio 
(Present address: Naval Research Laboratory, Washington, D.C.) 


1. Introduction. Wirtinger's inequality [5, p. 185] states that if x(t) is a real
function, absolutely continuous in $[0, 2\pi]$, and satisfying

$$x(0) = x(2\pi), \qquad \int_0^{2\pi} x(t)\,dt = 0,$$

then

$$\int_0^{2\pi} x'^2(t)\,dt \ge \int_0^{2\pi} x^2(t)\,dt,$$

equality holding if and only if there are real constants A, B such that, throughout
$[0, 2\pi]$, $x(t) = A\cos t + B\sin t$.

It is natural to look for a discrete analog of this result. Such an analog is the 
following proposition: 


THEOREM 1. Let $x_1, x_2, \ldots, x_n, x_{n+1} = x_1$ be reals with $\sum_{k=1}^{n} x_k = 0$. Then

$$\sum_{k=1}^{n} (x_{k+1} - x_k)^2 \ge 4\left(\sin^2\frac{\pi}{n}\right) \sum_{k=1}^{n} x_k^2. \tag{1}$$

Equality holds if and only if there are real constants A, B such that

$$x_k = A\cos\frac{2\pi(k-1)}{n} + B\sin\frac{2\pi(k-1)}{n}, \qquad k = 1, 2, \ldots, n. \tag{2}$$
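
Inequality (1) and its case of equality are easy to check numerically. The following Python sketch is only an illustration, with function names of our own choosing; it evaluates both sides of (1) for a cyclic sequence and builds a sequence of the form (2).

```python
import math

def wirtinger_sides(x):
    """Left and right sides of inequality (1) for the cyclic sequence x."""
    n = len(x)
    left = sum((x[(k + 1) % n] - x[k]) ** 2 for k in range(n))
    right = 4 * math.sin(math.pi / n) ** 2 * sum(v * v for v in x)
    return left, right

def extremal_sequence(n, a, b):
    """A sequence of the form (2); list index k here plays the role of k - 1."""
    return [a * math.cos(2 * math.pi * k / n) + b * math.sin(2 * math.pi * k / n)
            for k in range(n)]
```

For the zero-sum sequence 3, -1, 4, -1, -5 the left side of (1) is 146 while the right side is about 71.9; for `extremal_sequence(7, 2.0, -1.5)` the two sides agree up to rounding.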


Theorem | readily implies and is implied by its following complex version: 


THEOREM 2. Let $z_1, z_2, \ldots, z_n, z_{n+1} = z_1$ be complex numbers with $\sum_{k=1}^{n} z_k = 0$.
Then

$$\sum_{k=1}^{n} |z_{k+1} - z_k|^2 \ge 4\left(\sin^2\frac{\pi}{n}\right) \sum_{k=1}^{n} |z_k|^2. \tag{3}$$

If n > 2, equality holds if and only if (*) there is in the complex plane a regular
n-gon $w_1, w_2, \ldots, w_n$ such that $z_k = F(w_k)$, $k = 1, 2, \ldots, n$, where F is some affine
transformation of the plane into itself.


Theorem 2 is due to I. J. Schoenberg [7]. Theorem 1 (along with other, related 
inequalities) was given, independently, by K. Fan, O. Taussky and J. Todd [3], who 
indicated also that it extends to the complex case as well as to more general spaces. 

Observe that if n = 1 or n = 2, equality clearly holds in (3). The same is true [7]
if n = 3, because if $w_1, w_2, w_3$ is any equilateral triangle in the complex plane, there
exists an affine transformation F such that $z_k = F(w_k)$ for k = 1, 2, 3. In particular,
if n is 1, 2 or 3, equality holds in (1).

The main purpose of the present article is to reprove Theorem 1 using some 
simple geometric facts which are of interest in themselves. 

Inequalities (1) and (3) can be modified, replacing the differences in the left-hand
sides by differences of higher order [1,6]. For the sake of completeness, we give, in 
Section 3, these generalizations. 


2. Some geometry. Our proof of Theorem 1 will be based on the following 
geometric result, in which d(a, b) denotes the distance between a and b: 


THEOREM 3. On a sphere S: d(c, x) = r in an N-dimensional (real) Euclidean
space, let $P_1, P_2, \ldots, P_n$ ($n \ge 3$) be points such that

$$d(P_1, P_2) = d(P_2, P_3) = \cdots = d(P_{n-1}, P_n) = d(P_n, P_1) = d^*,$$

and such that c lies in the convex hull of $P_1, P_2, \ldots, P_n$. Then $d^* \ge a_n$, where $a_n$ is
the length of a side of a regular n-gon inscribed in a circle of radius r, namely,
$2r\sin(\pi/n)$. If c is an interior point of that convex hull (in the sense of the given
N-dimensional space) and N > 2, then $d^* > a_n$.


We postpone the proof of Theorem 3 to the end of this section. 
Proof of Theorem 1. We may assume that $\sum_{k=1}^{n} x_k^2 > 0$, and $n \ge 3$. Let
$P_1 = (x_1, x_2, \ldots, x_n)$, and, for $k = 2, 3, \ldots, n$, let

$$P_k = (x_k, x_{k+1}, \ldots, x_n, x_1, x_2, \ldots, x_{k-1}).$$

In (real) Euclidean n-space $E_n$, consider the sphere

$$S: |x| = \Big(\sum_{k=1}^{n} x_k^2\Big)^{1/2}.$$

Then the points $P_k$ lie on S, $\sum_{k=1}^{n} P_k = (0, 0, \ldots, 0)$, and

$$d(P_1, P_2) = d(P_2, P_3) = \cdots = d(P_{n-1}, P_n) = d(P_n, P_1) = \Big[\sum_{k=1}^{n} (x_{k+1} - x_k)^2\Big]^{1/2}.$$

Since the center of S is the centroid of the $P_k$, it lies in their convex hull.
Therefore, by Theorem 3, (1) holds.

We turn now to study the case of equality in (1). It is a straightforward exercise
in analytical geometry that there exist real constants A, B satisfying (2) if and only
if $P_1, P_2, \ldots, P_n$ is a regular n-gon with center Q, the origin of $E_n$. Hence, if there
exist such A, B, then $d(P_1, P_2)$ is the length of a side of a regular n-gon inscribed in a
circle of radius $(\sum_{k=1}^{n} x_k^2)^{1/2}$, and therefore (1) holds with equality sign.

Conversely, assume (1) holds with equality sign. Let E be the subspace of $E_n$
spanned by $P_1, P_2, \ldots, P_n$. Then dim E, the dimension of E, is > 1, for otherwise
we would have

$$\sum_{k=1}^{n} (x_{k+1} - x_k)^2 = d^2(P_1, P_2) = 4|P_1|^2 = 4\sum_{k=1}^{n} x_k^2 > 4\left(\sin^2\frac{\pi}{n}\right)\sum_{k=1}^{n} x_k^2. \tag{4}$$

Suppose dim E were > 2. Note that Q, as a point of E, is interior to the convex hull
of $P_1, P_2, \ldots, P_n$. For, otherwise, we could pass through Q a hyperplane H (of E)
such that all $P_k$ lie in one and the same closed half space determined by H. Since
$\sum_{k=1}^{n} P_k = Q$, all $P_k$ would lie on H, whose dimension is smaller than that of E. By
applying Theorem 3 (in particular, its last sentence) to the space E and to the sphere
$S \cap E$, we would reach (1) with strict inequality. Hence dim E = 2. Observe that for
no k can we have $P_k = P_{k+2}$. Indeed, if n is odd, such an equality would lead to
$x_1 = x_2 = \cdots = x_n$, while if n is even, we would obtain $x_1 = x_3 = \cdots = x_{n-1}$,
$x_2 = x_4 = \cdots = x_n$. Since $\sum_{k=1}^{n} x_k = 0$, the first possibility contradicts
$\sum_{k=1}^{n} x_k^2 > 0$; the second yields $x_1 = -x_2$, hence
$P_1 = P_3 = \cdots = P_{n-1} = -P_2 = -P_4 = \cdots = -P_n$, which implies dim E = 1.
From our assumption of equality in (1)
and from (4) it now follows that $P_1, P_2, \ldots, P_n$ is a regular n-gon with center Q.
Hence there exist real constants A, B satisfying (2).

For the proof of Theorem 3 we shall need another geometric result: 


THEOREM 4. In an n-dimensional (real) Euclidean space, consider a sphere
S: d(c, x) = r. Then the length of every closed continuous curve lying on S and
containing c in its convex hull is $\ge 2\pi r$. If n > 2 and c is an interior point (in the
n-dimensional sense) of the convex hull of such a curve, then a strict inequality holds.


Theorem 4, with n = 3, is due to W. Fenchel [4], and the proof we give is very
similar to his.


We proceed by induction. The theorem is easily seen to hold for n = 1 and n = 2.
Suppose it holds for some n ($\ge 2$). In an (n + 1)-dimensional (real) Euclidean space
consider a sphere S: d(c, x) = r, and a closed continuous curve C lying on S and
containing c in its convex hull. Among the finite subsets of C whose convex hull
contains c, choose one, F, having a minimal number of points, say k. Then [2, p. 35]
since C is connected, $k \le n + 1$. Thus, there exists a hyperplane H, with $F \subset H$,
$c \in H$. Now, F lies on the intersection of S with H, a sphere $S_1$ with center c and
radius r in the n-dimensional space H. By a proper replacement of arcs s of C joining
pairs of points of F with arcs of great circles, we replace C by a closed continuous
curve $C_1$, lying on $S_1$, whose length $L_{C_1}$ is $\le$ the length $L_C$ of C. Since $F \subseteq C_1$,
c lies in the convex hull of $C_1$, and so, by the induction hypothesis,
$L_{C_1} \ge 2\pi r$. Therefore $L_C \ge 2\pi r$.

Assume that c lies interior to the convex hull of C. Suppose, first, that k = 2; say,
F consists of the (distinct) points $a_1, a_2$. If we had $L_C = 2\pi r$, then clearly we would
have $C = G_1 \cup G_2$ where $G_1$ and $G_2$ are semi-great circles joining $a_1$ to $a_2$. But then
c cannot be interior to the convex hull of C. Hence $L_C > 2\pi r$. Suppose now k > 2. By
the minimum property of F, it cannot contain two (distinct) points collinear with c.
At least one of the above mentioned arcs s is not a (shortest) arc of a great circle, for
otherwise C would lie in H and, consequently, its convex hull would have no interior
points. Hence $L_C > L_{C_1} \ge 2\pi r$. This completes the proof.


Proof of Theorem 3. We may clearly assume that $N \ge 2$. The statement
$d^* \ge a_n$ is equivalent to the assertion that the (shortest) arc of a great circle on S
joining $P_1$ and $P_2$ (or any other pair of consecutive P's, including $P_n$, $P_1$) has length
$\ge 2\pi r/n$. Multiplying both sides by n, we obtain the equivalent statement that a
certain closed curve C joining $P_1$ to $P_2$, $P_2$ to $P_3$, ..., $P_{n-1}$ to $P_n$, and $P_n$ to $P_1$ has
length $\ge 2\pi r$. But this curve lies on the sphere and contains c in its convex hull;
therefore, by Theorem 4, the last inequality holds. Furthermore, if N > 2 and c lies
interior to the convex hull of $P_1, P_2, \ldots, P_n$, then c is interior to the convex hull of C.
Therefore, by Theorem 4, the length of C is $> 2\pi r$, which implies $d^* > a_n$.


3. Differences of higher order. Given numbers $x_j, x_{j+1}, \ldots, x_{j+p}$, $p \ge 0$, we set,
as usual,

$$\Delta^p x_j = \sum_{k=0}^{p} (-1)^k \binom{p}{k} x_{j+p-k}.$$

THEOREM 5. Let $x_1, x_2, \ldots, x_n$ be reals with $\sum_{k=1}^{n} x_k = 0$. Define $x_k$ for $k = n+1$,
$n+2, \ldots$ so that $(x_k)_{k=1}^{\infty}$ will be of period n. Then for every integer p ($\ge 0$) we have

$$\sum_{k=1}^{n} (\Delta^p x_k)^2 \ge 4^p \left(\sin^{2p}\frac{\pi}{n}\right) \sum_{k=1}^{n} x_k^2. \tag{5}$$

If p > 0, equality holds if and only if there are real constants A, B satisfying (2).
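
The higher-order differences and inequality (5) can be checked in the same numerical spirit. This Python sketch (function names are ours; `math.comb` requires Python 3.8 or later) computes $\Delta^p$ of a periodic sequence directly from the binomial formula and returns both sides of (5).

```python
import math

def delta_p(x, p):
    """p-th forward differences of the n-periodic sequence x, by the binomial formula."""
    n = len(x)
    return [sum((-1) ** k * math.comb(p, k) * x[(j + p - k) % n]
                for k in range(p + 1))
            for j in range(n)]

def sides_of_5(x, p):
    """Left and right sides of inequality (5)."""
    n = len(x)
    left = sum(d * d for d in delta_p(x, p))
    right = 4 ** p * math.sin(math.pi / n) ** (2 * p) * sum(v * v for v in x)
    return left, right
```

For x = 3, -1, 4, -1, -5 and p = 2 the second differences are 9, -10, 1, 12, -12, so the left side of (5) is 470, comfortably above the right side (about 99).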


Proof. True for p = 0 (for which (5) always holds with equality sign). Suppose
true for some $p \ge 0$. We shall prove it for p + 1.
Set $x'_k = \Delta x_k = x_{k+1} - x_k$, $k = 1, 2, \ldots, n$. Then

$$\sum_{k=1}^{n} x'_k = x_{n+1} - x_1 = 0.$$

Extend the definition of $x'_k$ to $k = n+1, n+2, \ldots$ so that $(x'_k)_{k=1}^{\infty}$ will be of period n.
Then $x'_k = x_{k+1} - x_k$ also for $k = n+1, n+2, \ldots$. By the induction hypothesis, and
by Theorem 1,

$$\sum_{k=1}^{n} (\Delta^{p+1} x_k)^2 = \sum_{k=1}^{n} (\Delta^p x'_k)^2
\ge 4^p \left(\sin^{2p}\frac{\pi}{n}\right) \sum_{k=1}^{n} x'^2_k
= 4^p \left(\sin^{2p}\frac{\pi}{n}\right) \sum_{k=1}^{n} (x_{k+1} - x_k)^2
\ge 4^{p+1} \left(\sin^{2(p+1)}\frac{\pi}{n}\right) \sum_{k=1}^{n} x_k^2.$$

Suppose there are real constants A, B such that (2) holds. Then

$$x'_k = A'\cos\frac{2\pi(k-1)}{n} + B'\sin\frac{2\pi(k-1)}{n}, \qquad k = 1, 2, \ldots, n,$$

where A', B' are some real constants. Hence

$$\sum_{k=1}^{n} (\Delta^p x'_k)^2 = 4^p \left(\sin^{2p}\frac{\pi}{n}\right) \sum_{k=1}^{n} x'^2_k$$

and by Theorem 1, $\sum_{k=1}^{n} x'^2_k = 4(\sin^2(\pi/n)) \sum_{k=1}^{n} x_k^2$. Therefore

$$\sum_{k=1}^{n} (\Delta^{p+1} x_k)^2 = 4^{p+1} \left(\sin^{2(p+1)}\frac{\pi}{n}\right) \sum_{k=1}^{n} x_k^2.$$

Conversely, suppose the last equality holds. Then

$$\sum_{k=1}^{n} (x_{k+1} - x_k)^2 = 4\left(\sin^2\frac{\pi}{n}\right) \sum_{k=1}^{n} x_k^2$$

and therefore, by Theorem 1, there are real constants A, B satisfying (2).
From Theorem 5 one can deduce its complex analog: 


THEOREM 6. Let $z_1, z_2, \ldots, z_n$ be complex numbers with $\sum_{k=1}^{n} z_k = 0$. Define $z_k$
for $k = n+1, n+2, \ldots$ so that $(z_k)_{k=1}^{\infty}$ will be of period n. Then for every integer
p ($\ge 0$) we have

$$\sum_{k=1}^{n} |\Delta^p z_k|^2 \ge 4^p \left(\sin^{2p}\frac{\pi}{n}\right) \sum_{k=1}^{n} |z_k|^2. \tag{6}$$

If p > 0 and n > 2, equality holds if and only if (*) of Theorem 2 holds.


As in the case of (3), equality always holds in (6) (and hence in (5)), if n is 1, 2 or 3.
Of course equality always holds in (6), if p = 0.


The author wishes to thank Professors D. Gale and D. J. Newman for their valuable suggestions. 


References 


1. H. D. Block, Discrete analogues of certain integral inequalities, Proc. Amer. Math. Soc., 8 
(1957) 852-859. 

2. H. G. Eggleston, Convexity, Cambridge Tracts in Mathematics and Mathematical Physics, 
No. 47, Cambridge University Press, 1958. 

3. K. Fan, O. Taussky, and J. Todd, Discrete analogs of inequalities of Wirtinger, Monatsh. 
Math., 59 (1955) 73-90. 

4. W. Fenchel, Uber Kriimmung und Windung geschlossener Raumkurven, Math. Ann.,101 
(1929) 238-252. 

5. G. H. Hardy, J. E. Littlewood, and G. Polya, Inequalities, 2nd edition, Cambridge University 
Press, 1952. 

6. A. M. Pfeffer, On certain discrete inequalities and their continuous analogs, J. Res. Nat. Bur. 
Standards Sect. B, 70B (1966) 221-231. 

7. I. J. Schoenberg, The finite Fourier series and elementary geometry, this MONTHLY, 57 (1950) 
390-404. 


CURRENT TRENDS IN ALGEBRA 
GARRETT BIRKHOFF, Harvard University 


1. Introduction. Symbolic algebra is much older than many mathematicians 
suppose; it can be traced back at least to Diophantus of Alexandria (ca. 250 A.D.) 
and Brahmagupta (ca. 598-665 A.D.). For this early work, see Cajori [7, Arts. 
101-5] and Ball [1, pp. 154-6]. Even so-called ‘‘modern”’ algebra is over a century 
old! 


Garrett Birkhoff is the Putnam Professor of Pure and Applied Mathematics at Harvard, where 
he did his undergraduate and graduate work, was a Junior Fellow in the Society of Fellows, and has 
served on the faculty since. He has been a Visiting Lecturer at the University of Washington, Univer- 
sity of Cincinnati, and the National University of Mexico, and held a Guggenheim Fellowship. He 
has served as President of SIAM, Vice-President of the AMS, the MAA, and the American Academy 
of Arts and Sciences, and Chairman of the CBMS. He is a member of the American Philosophical 
Society and the National Academy of Sciences, and has received honorary degrees from the National 
University of Mexico, the University of Lille, France, and the Case Institute of Technology. 

His extensive publications in modern algebra, fluid mechanics, numerical analysis, and nuclear 
reactor theory include the books Hydrodynamics (Princeton University Press, 1950); Lattice Theory 
(American Mathematical Society Colloquium Publications, 1940, Third Edition 1967); Survey 
of Modern Algebra (with S. Mac Lane, Macmillan, 1941, 1953, 1965); Jets, Wakes, and Cavities 
(with E. H. Zarantonello, Academic Press, 1957); Ordinary Differential Equations (Ginn, 1962); 
Algebra (with S. MacLane, Macmillan, 1967); Modern Applied Algebra (with T. C. Bartee, McGraw- 
Hill, 1970). Editor. 




When you realize this, you should not find it too hard to believe that the availabili- 
ty of high-speed computers is giving rise to new trends in algebra. My ultimate aim is 
to sketch for you, in §§8—10, what I think these trends are. But I wish to lead up to 
this theme by a brief résumé of the development of algebra as we know it today, 
over the past several centuries. 


2. Classical algebra. The name modern algebra was originally intended (in 
1930) to signify a contrast with classical algebra, which was generally understood 
to mean the theory of equations. This may be defined as the art of solving numerical 
problems by manipulating symbols, and seems to have originated with Al-Khwarizmi 
and other Islamic mathematicians during the period 800-1000 A.D. As we know, its 
most essential idea consists in replacing each verbal statement about numerical 
quantities by a symbolic equation, whose terms can be rearranged and combined by 
well-established general laws to give a sequence of equivalent but, hopefully, simpler 
equations. The original equation can be considered as ‘‘solved’’ when the unknown 
quantity has been isolated on one side of the equality symbol =, on the other side 
of which is some expression involving only known quantities. 

Though the word “root” (of an equation) can be traced back to the Sanskrit,¹
and the word ‘‘power’’ (of a number) appears in al-Khwarizmi’s Algebra (‘‘al- 
jabr’’), development of classical algebra in its present form was very gradual. The 
‘‘al-jabr’’ of the Arabs did not become widely used in Western Europe, and the 
symbols + and — did not achieve their present significance, until nearly 1500. A 
major advance followed shortly thereafter, the solution of cubic and quartic 
equations by radicals being already contained in Cardan’s Ars Magna (1545). 

For the next two centuries, progress in algebra was mainly³ in connection with 
its applications: to (analytic) geometry, which gave a vivid meaning to negative 
numbers, and to the calculus through the use of infinite series. Until after 1750, the 
significance of imaginary roots and complex numbers remained quite obscure, and 
even discussions of simultaneous linear equations and determinants were un- 
systematic and fragmentary. 

But from 1750 to 1830, thanks especially to the work of Euler, Lagrange, and 
Gauss, classical algebra developed rapidly into approximately its present form. 
Thus the exponential function $e^z$ became defined for all complex $z$ as a power series, 
and as a result $a^z = e^{z \ln a}$ became well-defined for all positive $a$ and complex $z$. 
The “Lagrange resolvent” was also invented by Euler [13, p. 27]. 
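
As a hedged illustration of the power-series definition mentioned above, the following Python sketch sums the series for $e^z$ and then forms $a^z = e^{z \ln a}$ for positive $a$. The names `exp_series` and `power` and the cutoff of 60 terms are assumptions made for this example only.

```python
import cmath
import math

def exp_series(z, terms=60):
    """Sum the first `terms` terms of 1 + z + z^2/2! + z^3/3! + ..."""
    total = 0j
    term = 1 + 0j                  # current term z^k / k!, starting at k = 0
    for k in range(terms):
        total += term
        term *= z / (k + 1)        # turn z^k / k! into z^(k+1) / (k+1)!
    return total

def power(a, z):
    """Compute a^z as e^(z ln a); only positive real a is allowed here."""
    if a <= 0:
        raise ValueError("a must be positive")
    return exp_series(z * math.log(a))

# Euler's relation e^(i*pi) = -1 serves as a sanity check on the series.
euler_check = exp_series(1j * math.pi)
```

A fixed cutoff is adequate only for moderate |z|; a production routine would choose the number of terms from the size of z.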

Above all, the Fundamental Theorem of Algebra was recognized as such, clearly 
stated, and proved. Euler considered its real forms, whose equivalence is easily shown. 
Two of these are: 

(a) Every real polynomial of degree n > 2 has proper factors. 
(b) Every real polynomial can be (uniquely) factored into real linear and quadratic 
factors. 


1. All notes are collected together at the end of the paper. 


762 GARRETT BIRKHOFF [September 


Condition (a) follows for n < 5 from Cardan, and for n = 5 because every real 
polynomial of odd degree has real roots. Euler satisfied himself that it was true for 
all n, but his proof is obscure. 

Conditions (a) and (b) are easily shown to be equivalent also to the usual state- 
ment of the Fundamental Theorem of Algebra: 

(c) Every complex polynomial can be factored into linear factors. 

Gauss gave many relatively rigorous proofs of (c) from about 1800 on, and made 
it clear that all polynomial equations had solutions in terms of complex numbers 
$x + yi$, $i = \sqrt{-1}$, while the geometrical interpretation of complex numbers as 
points in the (x, y)-plane gave them a more than symbolic meaning. 

Gauss also developed systematic iterative as well as elimination techniques for 
solving systems of simultaneous linear equations, and the laws of determinants also 
became generally known —all by 1825 or so. 

A few years later, Galois and Abel showed that it was impossible to solve a 
general equation of the fifth degree by radicals,⁴ and after this mathematicians 
gradually began to turn their attention from the theory of equations to non-numerical 
applications of symbolic algebra (e.g., to groups, vectors and matrices). 


MODERN ALGEBRA TO WORLD WAR I 


3. “Modern” algebra to 1860. As a result, although real and complex algebra 
dominated the textbook literature for a full century after 1830, “modern” algebra 
had already achieved some notable successes by 1860. 

Actually, already by 1770, Lagrange was interested in the ‘‘symmetric group’’ of 
all permutations of n letters and its subgroups, whose relevance to the solution by 
radicals of the general polynomial equation 


(1)    $x^n + a_1 x^{n-1} + \cdots + a_n = (x - x_1)(x - x_2) \cdots (x - x_n) = 0,$ 


he clearly recognized. A by-product of this interest was the Lagrange Theorem, that 
the order of any subgroup of a group G divides the order of G. 
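
The Lagrange Theorem can be checked by brute force on a small case. The sketch below, in which the helper names `compose` and `subgroups` are assumptions of this example, enumerates every subgroup of the symmetric group on 3 letters and confirms that each order divides 6.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the permutation 'apply q, then p' (tuples map index -> image)."""
    return tuple(p[i] for i in q)

def subgroups(group):
    """Brute-force all subgroups of a small finite permutation group."""
    elems = sorted(group)
    identity = tuple(range(len(elems[0])))
    found = []
    for size in range(1, len(elems) + 1):
        for subset in combinations(elems, size):
            s = set(subset)
            # A finite subset that contains the identity and is closed under
            # products is automatically closed under inverses.
            if identity in s and all(compose(p, q) in s for p in s for q in s):
                found.append(s)
    return found

S3 = set(permutations(range(3)))      # the symmetric group on 3 letters
subgroup_orders = sorted(len(h) for h in subgroups(S3))
all_divide = all(len(S3) % order == 0 for order in subgroup_orders)
```

The enumeration is exponential in the group order, so it is only meant for very small groups.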

Ruffini, Galois, and Cauchy made further contributions to the development of 
group theory before 1845 [ 13, pp. 45-53]; Galois also made (in 1830) a fundamental 
contribution to the theory of fields, by constructing a finite field of each prime-power 
order $p^n$. (For formal definitions of groups and fields, see §3.) 

Somewhat earlier Legendre and Gauss (1801) had initiated the study of commutative 
rings, by constructing the ring $Z_n$ of the integers “modulo” $n$ (i.e., in which 
integral multiples of $n$ are set equal to zero) and the ring $Z[i]$ of all “Gaussian 
integers” $m + ni$, where $m, n \in Z$ are ordinary integers⁵ and again $i = \sqrt{-1}$. 
Moreover, Gauss had proved that factorization into primes was unique in $Z[i]$. 
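
A small sketch may make the two rings concrete. It models arithmetic modulo n and represents a Gaussian integer m + ni as the pair (m, n). It also implements division with a remainder of smaller norm, which is the usual modern route to unique factorization in Z[i]; this is offered as an assumption about how one would argue today, not as Gauss's own proof. All function names are hypothetical.

```python
def mod_add(a, b, n):
    """Addition in the ring of integers modulo n."""
    return (a + b) % n

def mod_mul(a, b, n):
    """Multiplication in the ring of integers modulo n."""
    return (a * b) % n

def g_add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def g_mul(u, v):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

def norm(u):
    """N(a + bi) = a^2 + b^2."""
    return u[0] * u[0] + u[1] * u[1]

def nearest(x, n):
    """Nearest integer to x/n for n > 0, using integer arithmetic only."""
    return (2 * x + n) // (2 * n)

def g_divmod(u, v):
    """Return (q, r) with u = q*v + r and N(r) < N(v), for nonzero v."""
    a, b = u
    c, d = v
    n = norm(v)
    # u / v = u * conj(v) / N(v); round each coordinate to the nearest integer.
    q = (nearest(a * c + b * d, n), nearest(b * c - a * d, n))
    qv = g_mul(q, v)
    return q, (a - qv[0], b - qv[1])
```

For instance, `mod_mul(2, 3, 6)` is 0, showing that the integers modulo 6 have zero divisors, while `g_mul((1, 1), (1, -1))` is `(2, 0)`, so 2 is not prime in Z[i].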

By 1850, noncommutative rings were also being studied. Thus Hamilton in- 
troduced his quaternions in 1843; since they contain the complex numbers as a 
special case, they may be called hypercomplex numbers. And in the first edition of 
his book Ausdehnungslehre (1844) H. Grassmann discussed both vector algebra 
(a fairly natural generalization of Descartes’ symbolic method for treating geometry) 
and, somewhat vaguely, hypercomplex numbers in general. These concepts were 
made much more precise (and their connections with n-dimensional geometry 
clarified) by Cayley; by Hamilton in the Preface to his book on Quaternions (1859); 
and by Grassmann in the second edition of his book (1878). Moreover, Cayley 
showed in 1858 [13, p. 84] that the theory of determinants of Vandermonde and 
Laplace was only one aspect of a much more powerful matrix algebra. Matrix 
algebra is much like ordinary algebra, except that for general matrices A and B, 
$AB \neq BA$: the multiplication of matrices, like that of quaternions, is non-commutative. 
Indispensable for all pure and applied mathematicians today, matrices were first 
introduced formally by Cayley in 1858, and gradually revolutionized linear algebra.⁶ 

Shortly before, two other novel areas of modern algebra had been opened up. 
In 1854, Boole had published his Investigation of the Laws of Thought, in which he 
showed that a substantial part of Aristotelian logic was described by an analog of 
ordinary algebra now called “Boolean algebra.” This novel “algebra of logic” 
satisfied not only most of the laws of ordinary algebra, but also the curious identities 


$a^2 = a + a = a$ (which today would be written $a \wedge a = a \vee a = a$), 
$(a + b)a = a$, and $(a + b)(a + c) = a + bc$. 
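
These identities can be verified exhaustively on the two truth values. In the sketch below, which is an illustration with hypothetical helper names, the sum is read as "or" and the product as "and" on {0, 1}.

```python
from itertools import product

def b_or(a, b):
    """Boolean sum a + b, read as 'or'."""
    return a | b

def b_and(a, b):
    """Boolean product ab, read as 'and'."""
    return a & b

def holds(identity, arity):
    """Check an identity for every assignment of 0/1 to its variables."""
    return all(identity(*values) for values in product((0, 1), repeat=arity))

idempotent = holds(lambda a: b_and(a, a) == a == b_or(a, a), 1)
absorption = holds(lambda a, b: b_and(b_or(a, b), a) == a, 2)
distributive = holds(
    lambda a, b, c: b_and(b_or(a, b), b_or(a, c)) == b_or(a, b_and(b, c)), 3
)

# The same formula fails in ordinary arithmetic: (1 + 1) * 1 is 2, not 1.
ordinary_fails = (1 + 1) * 1 != 1
```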


4. The axiomatic approach. We have just seen that many of the major branches 
of so-called “modern algebra” (rechristened “the new math” by the popular press 
in the post-Sputnik era) were already known to mathematicians by 1860. However, 
the axiomatic approach to the foundations of algebra did not come until later. 
Lagrange derived the Lagrange theorem for groups and Galois constructed Galois 
fields without ever thinking of groups or fields as defined by postulates at all; their 
assumptions were entirely intuitive! Even the names “commutative” and “distributive” 
for the corresponding laws of manipulation were not introduced (by 
Servois) until 1814,⁷ nor the term “associative” (by Hamilton) until 1835. 

The emancipation of algebra from exclusive concern with the real and complex 
fields owes much to the philosophical speculations about algebra of Peacock, 
Woodhouse,⁸ Hamilton, de Morgan, Boole, and Cayley, but E. T. Bell’s claim 
[3, pp. 180-1] that it was Peacock who “first perceived common algebra as an 
abstract hypothetico-deductive science of the Euclidean pattern” goes too far. 
Though Peacock anticipated Hankel in announcing the “principle of permanence of 
equivalent forms,” his “Symbolical Algebra” is mainly concerned with geometrical 
applications, and does not even mention axioms or postulates. In these qualities it 
resembles H. Grassmann’s Ausdehnungslehre (1844).⁹ 

The role of axioms emerges much more clearly from the Formenlehre of R. 
Grassmann (1872); the Operationskreis des Logikkalkuls of E. Schröder (1877); the 
axiomatic treatments of groups, fields, modules, and ideals by Cayley (1878), Frobenius 
and Stickelberger (1879), Dedekind,¹⁰ Weber (1882, 1893), and E. H. Moore; and 
the independent contemporary work of Benjamin Peirce and his son, C. S. Peirce, 
at Harvard (1870-1881).¹¹ 

Influenced by these writings, Peano¹² initiated in 1888 his axiomatic approach to 
arithmetic, about which I shall say much more later. A decade later, in his Grundlagen 
der Geometrie [9], Hilbert tried to improve on Euclid. He succeeded from the 
standpoint of rigor, but not from that of pedagogy! Perhaps his most fundamental 
contribution to axiomatics was his clear formulation of the notions of independence, 
consistency and completeness for axiom systems. 

In 1902, E. H. Moore showed that Hilbert’s own axioms were not independent, 
and during the next ten years E. V. Huntington, L. E. Dickson, and O. Veblen made 
other painstaking analyses of the independence of postulate systems for groups, 
fields in general, the real and complex fields in particular, the algebra of logic, and 
the foundations of geometry. One can get an excellent picture of this work by reading 
the papers by Moore and Huntington;¹³ for a more colorful if less reliable survey, 
see [2, Ch. 3]. 

Partly as a result of such papers, the postulational approach to algebra finally 
became standard. Mathematicians found that amazingly few and simple postulates, 
many fewer than those of Euclidean geometry,¹⁴ could provide a sufficient basis 
for very extensive algebraic theories. For example, all of group theory can be derived 
from general principles of logic and the following postulates, due to E. V. Huntington 
(1906). 


DEFINITION. A group G is a set of elements (to be denoted by small Latin letters), 
any two of which, say x and y, have a product xy which satisfies the following 
conditions: 

G1. Multiplication is associative: $x(yz) = (xy)z$ for all $x, y, z \in G$. 

G2. For any two elements $a, b \in G$, there exist $x, y \in G$ such that $xa = b$ and 
$ay = b$. 

(We have used Peano’s notation $x \in G$ above; it signifies that “the element x is a 
member of (belongs to) the set G.”) 


Ingenious arguments can be used to deduce from these postulates various other 
simple conditions, for example, that: (i) any group G contains a unique “idempotent” 
element e satisfying $ee = e$, (ii) this element satisfies $ex = xe = x$ for all $x \in G$ (acts 
as an “identity” for G), (iii) the elements x and y in G2 are uniquely determined by 
a and b, and so on. 
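
For a finite set with a given multiplication, postulates G1 and G2 can be tested directly, and consequences (i) and (ii) can then be observed rather than proved. The following sketch uses the nonzero residues modulo 7 as a test case; the function names and the choice of example are assumptions of this illustration.

```python
def satisfies_G1(elems, op):
    """G1: x(yz) = (xy)z for all x, y, z."""
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x in elems for y in elems for z in elems)

def satisfies_G2(elems, op):
    """G2: for all a, b there exist x, y with xa = b and ay = b."""
    return all(any(op(x, a) == b for x in elems) and
               any(op(a, y) == b for y in elems)
               for a in elems for b in elems)

def idempotents(elems, op):
    """All elements e with ee = e."""
    return [e for e in elems if op(e, e) == e]

def mul7(x, y):
    return (x * y) % 7

units_mod_7 = [1, 2, 3, 4, 5, 6]       # nonzero residues modulo 7
all_residues = [0, 1, 2, 3, 4, 5, 6]   # including 0 breaks postulate G2

is_group = satisfies_G1(units_mod_7, mul7) and satisfies_G2(units_mod_7, mul7)
e_list = idempotents(units_mod_7, mul7)
acts_as_identity = all(mul7(e_list[0], x) == x == mul7(x, e_list[0])
                       for x in units_mod_7)
```

With 0 included, G2 fails because the equation x·0 = 1 has no solution, so `satisfies_G2(all_residues, mul7)` is false.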

Similarly, the entire theory of fields can be deduced from the following set of 
postulates, also due to Huntington. 


DEFINITION. A field is a set F of elements, any two of which have a sum x + y 
and a product xy which satisfy the following conditions: 

F1. Addition and multiplication are commutative: 
$x + y = y + x$ and $xy = yx$ for all $x, y \in F$. 

F2. Addition and multiplication are associative: 
$x + (y + z) = (x + y) + z$ and $x(yz) = (xy)z$, all $x, y, z \in F$. 

F3. Multiplication is distributive on sums: 
$x(y + z) = xy + xz$ for all $x, y, z \in F$. 

F4. For any $a, b \in F$, there exists some $x \in F$ such that $a + x = b$. 

F5. If $a + a \neq a$, then there exists some $y \in F$ such that $ay = b$. (Actually, 
Huntington weakened F5 by adding the condition $b + b \neq b$ to its hypothesis.) 

(Of course, the hypothesis $a + a \neq a$ is just an indirect way of assuming that $a \neq 0$, 
necessary here because Huntington wanted to avoid assuming the existence of a 
“zero” 0 in F.) 
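
Postulates F1 to F5 can likewise be tested mechanically on the integers modulo n. The sketch below is an illustration with a hypothetical function name; it checks the postulates exactly as stated, including the indirect hypothesis a + a ≠ a in F5, and it is intended for n ≥ 2.

```python
def satisfies_field_postulates(n):
    """Test F1-F5 for the integers modulo n (intended for n >= 2)."""
    F = range(n)

    def add(x, y):
        return (x + y) % n

    def mul(x, y):
        return (x * y) % n

    f1 = all(add(x, y) == add(y, x) and mul(x, y) == mul(y, x)
             for x in F for y in F)
    f2 = all(add(x, add(y, z)) == add(add(x, y), z) and
             mul(x, mul(y, z)) == mul(mul(x, y), z)
             for x in F for y in F for z in F)
    f3 = all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
             for x in F for y in F for z in F)
    f4 = all(any(add(a, x) == b for x in F) for a in F for b in F)
    # F5 applies only to those a with a + a != a, i.e. to a != 0.
    f5 = all(any(mul(a, y) == b for y in F)
             for a in F for b in F if add(a, a) != a)
    return f1 and f2 and f3 and f4 and f5

fields_found = [n for n in range(2, 12) if satisfies_field_postulates(n)]
```

Only prime moduli survive the test. For n = 6, postulate F5 fails, since 2y = 1 has no solution modulo 6.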

The postulational approach to algebra, combined with an awareness of the 
relevance of all kinds of algebraic systems, stimulated an interest in enumerating all 
possible algebraic systems satisfying specified conditions: all finite fields (Galois 
had found them all), all groups of given order n, and so on. In this enumeration, one 
must of course identify all groups (or fields) which are isomorphic, that is, whose 
elements are related by a bijection which preserves group multiplication (in fields, 
which preserves addition and multiplication). Such a bijection is called an 
isomorphism. 


5. Morphisms and subalgebras. More generally, it is helpful to know when two 
algebraic systems A and B are related by a (homo)morphism, or mapping $\theta\colon A \to B$ 
which preserves all their defining operations. Finally, it is helpful to recognize the 
subalgebras of A, i.e., the subsets S of A which satisfy all postulates; under these 
circumstances, A is conversely called an extension of S. (Thus the complex field is 
an extension of the real field.) To test for being a subalgebra, it is usually sufficient 
to test for closure with respect to suitable operations. In a group, for example, a 
subgroup must contain: (i) the identity, (ii) with any x also $x^{-1}$; and (iii) with x and 
y also xy. In fields, one must require closure under addition, subtraction, multiplica- 
tion, and division. 
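
Both the subgroup test and the morphism condition are easy to state as finite checks. The sketch below uses the additive group of integers modulo 12; the helper names `is_subgroup` and `is_morphism` and the examples chosen are assumptions of this illustration.

```python
def is_subgroup(subset, op, identity, inverse):
    """Closure test: (i) identity, (ii) inverses, (iii) products."""
    s = set(subset)
    return (identity in s
            and all(inverse(x) in s for x in s)
            and all(op(x, y) in s for x in s for y in s))

def is_morphism(theta, domain, op_a, op_b):
    """theta preserves the operation: theta(x*y) = theta(x)*theta(y)."""
    return all(theta(op_a(x, y)) == op_b(theta(x), theta(y))
               for x in domain for y in domain)

Z12 = range(12)

def add12(x, y):
    return (x + y) % 12

def neg12(x):
    return (-x) % 12

multiples_of_4 = {0, 4, 8}     # closed under addition and negation mod 12
not_closed = {0, 1}            # fails: 1 + 1 = 2 is missing

# Reduction mod 4 respects addition because 4 divides 12; mod 5 does not.
to_Z4 = is_morphism(lambda x: x % 4, Z12, add12, lambda x, y: (x + y) % 4)
to_Z5 = is_morphism(lambda x: x % 5, Z12, add12, lambda x, y: (x + y) % 5)
```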

The preceding concepts apply to all of the usual kinds of algebraic systems; I shall 
come back to them in my next lecture. 


6. Some deeper developments: 1860-1914. During the same decades that its 
foundations were being clarified by the postulational method, the scope and depth 
of algebra grew enormously. I can only indicate very sketchily a few especially 
remarkable results here. 

First, Galois theory became clarified as follows; I shall stick to extensions of the 
rational field Q to fix ideas, but the results generalize to extensions of any field. 
Let $F = Q[x_1, \ldots, x_n]$ be the field generated by the roots of a polynomial 

$p(x) = (x - x_1)(x - x_2) \cdots (x - x_n) = x^n + a_1 x^{n-1} + \cdots + a_n = 0$ 

with coefficients $a_k \in Q$. Define the Galois group $G(F : Q)$ of F (and of p(x)) over Q 
to be the group of all automorphisms $\alpha$ of F such that $\alpha(x) = x$ for all $x \in Q$. Then 
the theorem of Galois states that the equation $p(x) = 0$ is solvable by radicals in 
terms of the coefficients (i.e., over Q) if and only if the Galois group $G(F : Q)$ is 
“solvable” in the following sense. 


DEFINITION. Define a composition series of a finite group G to be a chain of 
subgroups of G, 

$1 < S_1 < S_2 < \cdots < S_r = G,$ 

each of which is a maximal normal subgroup of the one following. Form the associated 
quotient-groups $S_i/S_{i-1}$. Then G is called solvable when these quotient-groups 
are all Abelian (it is equivalent that they all be of prime order). 
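
Computing a composition series by machine takes some care, so the sketch below substitutes a different but equivalent test for finite groups: G is solvable exactly when its derived series (G, then the subgroup generated by all commutators, and so on) reaches the trivial group. This swap of technique, and every function name here, is an assumption of the example and not the definition used in the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q, then p'."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated(gens, n):
    """Subgroup of the symmetric group on n letters generated by gens."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(G, n):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    comms = {compose(compose(a, b), compose(inverse(a), inverse(b)))
             for a in G for b in G}
    return generated(comms, n)

def is_solvable(G, n):
    """Follow the derived series until it is trivial or stops shrinking."""
    while len(G) > 1:
        D = derived_subgroup(G, n)
        if len(D) == len(G):
            return False
        G = D
    return True

def symmetric_group(n):
    return set(permutations(range(n)))
```

On the symmetric groups this reproduces the classical picture: the groups on 3 and 4 letters are solvable, and the group on 5 letters is not, in line with the unsolvability of the general quintic.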

Second, pure group theory acquired depth. Among the many remarkable theorems 
about finite groups proved in the half-century 1860-1914, I shall mention only a few. 
First, it was shown that the set of $S_i/S_{i-1}$ is the same (up to isomorphism and 
rearrangement) for all composition series (Jordan-Hölder Theorem). Again, it was 
shown that any group of prime-power order $p^n$ is solvable. Finally, it was shown 
that if $p^n$ divides the order of a group G, then G has a subgroup of order $p^n$ (Sylow 
theorem). 

Third, in the area of algebraic number theory, Dedekind developed ideal theory, 
and applied it to generalize the pioneer result of Gauss on the unique factorization 
of Gaussian integers, to a sweeping unique factorization theorem for any algebraic 
number field (i.e., any subfield of the complex field C having finite linear dimension 
over the rational subfield Q). Namely, he showed that factorization into prime ideals 
is unique.¹⁵ 

Dedekind’s deep interest in ideal theory and in unique factorization into primes 
also led him to consider the operations of greatest common divisor (g.c.d.) and least 
common multiple (l.c.m.) from a postulational standpoint. Recognizing their analogy 
with “and” and “or” in Boolean algebra, he was led to develop and apply the 
elementary theory of lattices (“Dualgruppen”), modular lattices, distributive 
lattices, and vector lattices in two pioneer papers (1897, 1901), thus founding a 
major new branch of algebra which contained Boolean algebra as a special case. 


7. Linear associative algebras. In 1870, at about the same time that Dedekind was 
developing ideal theory into a powerful tool, Benjamin Peirce of Harvard made a 
pioneer study of the systems of “hypercomplex numbers” vaguely adumbrated by 
Grassmann, Hamilton and Cayley. Peirce began by defining a “linear algebra” over 
a field F as a set A whose elements are arbitrary linear combinations 


(2)    $a = (a_1, \ldots, a_r) = a_1 i_1 + \cdots + a_r i_r$ 


of r basis elements $i_k$, multiplied by some rule of the form 


(2’)    $a * b = \sum (a_l b_m)\, i_l * i_m = \sum a_l b_m \gamma_{lmn} i_n$; 


the constants $\gamma_{lmn}$ can be any scalars (elements of F). He called a linear algebra 
associative when the multiplication defined by (2’) is associative. 

A very notable linear associative algebra is provided by Hamilton’s quaternions, 
which have four basic elements 1, i, j, k and hence 64 constants (mostly zero) defined 
by the rules 


(3)    $1 \cdot a = a \cdot 1 = a$ for all a,  $i^2 = j^2 = k^2 = -1$, 
(3’)    $i \cdot j = -j \cdot i = k$,  $j \cdot k = -k \cdot j = i$,  $k \cdot i = -i \cdot k = j$. 


The identities of (3’) are clearly those for vector products. The quaternion algebra 
R[i, j, k] over the real field is also a division algebra: any nonzero quaternion 
$a = a_0 + a_1 i + a_2 j + a_3 k \neq 0$ has an inverse, given by 


(3’’)    $a^{-1} = (a_0 - a_1 i - a_2 j - a_3 k)/(a_0^2 + a_1^2 + a_2^2 + a_3^2)$. 

Peirce¹⁶ showed that the complex numbers and the quaternions formed the only 
hypercomplex division algebras over the real field. 
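
The quaternions give a convenient test case for multiplication by structure constants. The sketch below tabulates the 64 constants from rules (3) and (3’), multiplies by the general bilinear rule, and checks the inverse formula with exact rational arithmetic. The basis ordering (index 0 for 1, then i, j, k) and all names are assumptions of this example.

```python
from fractions import Fraction

def basis_product(l, m):
    """Return (sign, n) with i_l * i_m = sign * i_n, per rules (3) and (3')."""
    if l == 0:
        return (1, m)            # 1 * a = a
    if m == 0:
        return (1, l)            # a * 1 = a
    if l == m:
        return (-1, 0)           # i^2 = j^2 = k^2 = -1
    # Cyclic products i*j = k, j*k = i, k*i = j; reversed order flips the sign.
    sign = 1 if (l, m) in ((1, 2), (2, 3), (3, 1)) else -1
    return (sign, 6 - l - m)

# GAMMA[l][m][n] is the coefficient of i_n in the product i_l * i_m.
GAMMA = [[[0] * 4 for _ in range(4)] for _ in range(4)]
for l in range(4):
    for m in range(4):
        sign, n = basis_product(l, m)
        GAMMA[l][m][n] = sign

def qmul(a, b):
    """Multiply coordinate 4-tuples using the structure constants."""
    return tuple(sum(a[l] * b[m] * GAMMA[l][m][n]
                     for l in range(4) for m in range(4))
                 for n in range(4))

def qinv(a):
    """Inverse of a nonzero quaternion: conjugate divided by the norm."""
    norm = sum(Fraction(x) * x for x in a)
    if norm == 0:
        raise ZeroDivisionError("zero quaternion has no inverse")
    return (a[0] / norm, -a[1] / norm, -a[2] / norm, -a[3] / norm)

nonzero_constants = sum(1 for l in range(4) for m in range(4)
                        for n in range(4) if GAMMA[l][m][n] != 0)
```

Only 16 of the 64 constants are nonzero, one for each ordered pair of basis elements.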

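Even more important is the full matrix algebra $M_n(F)$ of all $n \times n$ matrices 
$A = \| a_{lm} \| = \sum a_{lm} e_{lm}$. The basis elements $e_{lm}$ of $M_n(F)$ are multiplied by the rules that 


(4)    $e_{lm} e_{l'm'} = \begin{cases} e_{lm'} & \text{if } m = l', \\ 0 & \text{otherwise.} \end{cases}$ 


Hence, the constants are given (in a slightly changed notation) by 


(4’)    $\gamma_{lm,\,l'm'}^{\,l''m''} = \begin{cases} 1 & \text{if } l' = m,\ l'' = l,\ m'' = m', \\ 0 & \text{otherwise.} \end{cases}$ 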
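
The multiplication rule for the basis matrices can be confirmed by direct computation. In this sketch, `unit(l, m, n)` is a hypothetical helper building the n × n matrix with a single 1 in row l and column m (indices counted from 0), and matrices are plain nested lists.

```python
def unit(l, m, n):
    """The n x n matrix with 1 in position (l, m) and 0 elsewhere."""
    return [[1 if (r, c) == (l, m) else 0 for c in range(n)] for r in range(n)]

def zero(n):
    return [[0] * n for _ in range(n)]

def matmul(A, B):
    """Ordinary row-by-column product of two square matrices."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def rule_holds(n):
    """Check that e_lm e_l'm' is e_lm' when m = l', and zero otherwise."""
    idx = range(n)
    return all(
        matmul(unit(l, m, n), unit(l2, m2, n))
        == (unit(l, m2, n) if m == l2 else zero(n))
        for l in idx for m in idx for l2 in idx for m2 in idx
    )

# Non-commutativity already shows up for 2 x 2 basis matrices.
left = matmul(unit(0, 1, 2), unit(1, 0, 2))    # equals unit(0, 0, 2)
right = matmul(unit(1, 0, 2), unit(0, 1, 2))   # equals unit(1, 1, 2)
```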

From 1870 on, many mathematicians tried to classify linear associative algebras 
over the real and complex fields, using the Fundamental Theorem of Algebra as a tool 
where convenient. Papers by Frobenius (1878, 1903), Molien (1893), and Cartan 
(1898) were especially noteworthy.¹⁷ 

In a remarkable paper published in 1907, Wedderburn showed that most of the 
structure theorems of Cartan and Frobenius could be proved for linear associative 
algebras over an arbitrary field! In particular, he proved the following basic results, 
whose precise meaning will be explained below. For further details, see [3, Ch. 11]. 
Wedderburn himself stated that “Most of the results contained in the present paper 
have already been given, chiefly by Cartan and Frobenius, for algebras over the 
rational field.” 

(i) Any linear associative algebra is the direct sum (in the vector space sense) of a 
“semisimple” subalgebra and a unique “nilpotent” invariant subalgebra; 

(ii) The semisimple summand in (i) is the direct sum of “simple” linear associative 
algebras, in a unique way; 

(iii) Each simple summand in (ii) is, for some n, the “full matrix algebra” 
$M_n(D)$ of all $n \times n$ matrices $A = \| a_{ij} \|$ with entries $a_{ij}$ in a suitable “division algebra” 
D over F, the field of scalars of the original linear associative algebra. 

To explain (i), we recall that a linear algebra is called “nilpotent” when, for some 
finite integer n, all products $a_1 a_2 \cdots a_n$ of n factors vanish. A “subalgebra” of an 
algebra is a subset closed under addition and multiplication (as well as linear 
combination over the field of scalars); such a subalgebra K is called “invariant” 
when $k \in K$ implies $ak \in K$ and $ka \in K$ for any element a, even if not in K; this is 
the condition that K be an ideal in the sense of ring theory. 

Not all linear algebras are associative. The most important family of non- 
associative algebras is the family of Lie algebras. In these, the associative law 
is replaced by the following three identities: 


[aa] = 0, [ab] + [ba] = 0, [[ab]c] + [[bc]a] + [[ca]b] = 0, 


true for all a, b, c. In the 1870’s, Lie had shown that real and complex Lie algebras 
provided the key to the understanding of continuous groups based on a finite number 
of parameters. It was therefore most remarkable that Killing (1888-1890) and 
Élie Cartan (1894) were able to prove that Lie algebras satisfied structure theorems 
somewhat analogous to those for associative algebras stated above, and to determine 
all “simple” Lie algebras over C. This work of Killing and Cartan on the 
structure of Lie algebras came before the analogous work of Molien and Cartan on 
linear associative algebras.¹⁷ᵃ 
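
A standard source of Lie algebras, assumed here for illustration, is the commutator [ab] = ab − ba on square matrices. The sketch below checks the three Lie identities on a handful of 2 × 2 integer matrices; the sample matrices and helper names are hypothetical choices.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """The commutator [AB] = AB - BA."""
    return sub(matmul(A, B), matmul(B, A))

ZERO = [[0, 0], [0, 0]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
samples = [E, F, H, [[1, 2], [3, 4]], [[0, -1], [1, 0]], [[2, 0], [1, 1]]]

self_bracket_zero = all(bracket(a, a) == ZERO for a in samples)
antisymmetric = all(add(bracket(a, b), bracket(b, a)) == ZERO
                    for a in samples for b in samples)
jacobi = all(
    add(add(bracket(bracket(a, b), c), bracket(bracket(b, c), a)),
        bracket(bracket(c, a), b)) == ZERO
    for a in samples for b in samples for c in samples
)
```

The matrices E, F, H satisfy [EF] = H, which is the familiar multiplication table of the smallest simple Lie algebra.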

One can, perhaps, summarize the preceding developments in the statement that 
more was known about “modern algebra” by research algebraists in 1914 than 
most Ph.D.’s know today. However, algebra was still regarded as subordinate to 
classical analysis, and the complex field reigned supreme. Thus, of the two advanced 
texts on algebra (as distinguished from number theory) most widely used in 1900, 
Weber’s began with a chapter on algebraic functions and Serret’s with one on continued 
fractions! 


8. Symbolic logic to Gödel. In retrospect, it seems not too surprising that the 
dramatic successes of 19th century algebraists and logicians should have encouraged 
some imaginative mathematicians to develop a symbolic logic which would reduce 
all theorem-proving to mechanical symbol manipulations according to prescribed 
rules or “axioms.” Actually, this idea goes back at least to Leibniz, who envisioned 
around 1700 symbolic methods capable of “increasing the power of reason far more 
than any optical instrument has ever aided the power of vision.” To his fertile mind, 
the powerful symbolic algebra of the differential and integral calculus (much of 
which he invented) must have seemed a direct confirmation of the potentialities of 
symbolic methods. 

The symbolic approach was developed tremendously by Peano from 1889 on. 
His main contributions to it may be found in his Formulario Matematico (5th ed., 
1908), whose preface states that: “All progress in mathematics is in response to the 
introduction of symbols (ideographic signs). ... Among two symbolic systems, the 
one with fewer symbols is, in general, preferable. But the fundamental use of 
[symbolic methods] is to facilitate calculation.” The preface continues with a 
review of the origin of various symbols, including +, ×, D (derivative), ∫, and those 
for vector and Boolean algebra. It then proposes for general adoption the symbols 
∈ (for membership) and ∃ (there exists). Peano claims that with these, and a handful 
of other symbols and symbolic conventions, all mathematics can be presented in 
symbolic form.¹⁸ 

Actually, Peano was not the first mathematician to conceive of a purely symbolic 
mathematics. In 1634, Hérigone had written in the Preface of his Cursus Mathe- 
maticus: ‘‘I have invented a new method of making demonstrations, brief and 
intelligible, without the use of any language,’’ and his symbolic style was adhered to 
by Wallis (1656) and Barrow (1655, 1660).¹⁹ 

Peano then substantiates his claim by 386 pages of text containing symbolic 
synopses of: (1) Mathematical Logic, (2) Arithmetic, (3) Algebra, (4) Vectors (“Geometry”), 
(5) Limits, (6) Derivatives, (7) Integrals. Most successful are Parts (1) and 
(2); the latter contains Peano’s celebrated construction²⁰ of the nonnegative integers 
by his “successor function”: $1 = 0+$, $2 = 1+$, $3 = 2+$, ..., and his derivation 
of the laws of arithmetic from it is superb. In 70 additional pages, Peano extends 
his purely symbolic treatment to plane curves, differential equations, and various 
other topics. 
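
The idea of building arithmetic from a successor function alone can be sketched in a few lines. In the illustration below, zero is the empty tuple and the successor of n is the one-element tuple containing n; this encoding, the recursive definitions of addition and multiplication, and all names are assumptions of the example rather than Peano's own notation.

```python
ZERO = ()

def succ(n):
    """The successor n+ of n, encoded as a tuple wrapping n."""
    return (n,)

def add(m, n):
    """m + 0 = m;  m + (n+) = (m + n)+."""
    return m if n == ZERO else succ(add(m, n[0]))

def mul(m, n):
    """m * 0 = 0;  m * (n+) = m * n + m."""
    return ZERO if n == ZERO else add(mul(m, n[0]), m)

def from_int(k):
    """Apply the successor k times to zero."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    """Count how many successors were applied."""
    k = 0
    while n != ZERO:
        n = n[0]
        k += 1
    return k

small = [from_int(k) for k in range(5)]
commutative = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
                  for a in small for b in small)
distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                   for a in small for b in small for c in small)
```

Checking the laws on a few small numbers is of course no substitute for the inductive proofs that Peano gave; it only shows the definitions at work.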

However, Peano’s Formulaire must be viewed as primarily a thought-provoking 
tour de force, in spite of its wealth of ideas and insights. Nowhere does he list the 
rules of symbol-manipulation for passing from one formula to the next; he fails to 
provide a system of axioms for logic. His proofs, like Euclid’s in geometry, can 
only be verified by attributing meanings to words. 

This major gap was filled by Whitehead and Russell in their three-volume 
masterpiece Principia Mathematica [18]. Here they specified carefully the symbol- 
manipulations (‘‘rules of inference’’) which can be used infallibly in passing from 
hypotheses to conclusions in symbolic logic (mathematical reasoning). 

Using their specified rules of inference as ‘‘axioms’’ for symbolic logic, Whitehead 
and Russell showed that one can paraphrase symbolically at least the construction 
of the real field R from the positive integers, as well as much of set theory and 
arithmetic. These major achievements were presented as empirical evidence support- 
ing the thesis that all mathematical theorem-proving can be reduced to mechanical 
symbol-manipulations (i.e., to pure symbolic logic).²¹ 

Nobody disputes the claims of Whitehead and Russell that their rules of inference 
for “Peanese” (the symbolic language of Peano) are (i) infallible subject to 
restrictions stated in English in their text, and (ii) sufficient for much of elementary 
mathematics. However, the actual mathematical coverage of Principia (in nearly 
2000 pages!) is far less than Peano’s, and it cannot be said that the symbolic methods 
used “increase the power of reason”; I think they decrease it, probably for 
psychological reasons.²² 


9. Hilbert and Gödel, 1918-31. Because of its capability of replacing special 
axioms for the different branches of mathematics by theorems (cf. Principia 
Mathematica, Preface, first paragraph), Hilbert said in 1918 that “Russell’s 
Axiomatization of Logic is the crowning achievement of axiomatics.”²³ And Hilbert 
spoke with authority, as the man who had rigorized the axioms of Euclid in his 
famous Grundlagen der Geometrie only 20 years before. I quote from the introduction 
to this book: 

“Geometry, like number theory, requires for its deductive (folgerichtige) 
construction only a few basic theorems (Grundsätze). These theorems are called 
axioms,²⁴ and their connected development has had numerous treatments since 
Euclid .... The following book is a new attempt to develop the simplest possible 
complete axiom system for geometry ... so as to clarify the significance of the different 
groups of axioms and the consequences of the individual axioms.” 

In much the same spirit of axiomatic analysis, Hilbert and his collaborators, 
especially his co-authors W. Ackermann and P. Bernays,²⁵ made after 1918 major 
efforts to prove deductively (by metamathematical arguments) the adequacy of the 
axioms of Whitehead and Russell (the evidence of Principia Mathematica was 
empirical). They focused attention on two main questions: 

(i) Are these axioms contradiction-free, i.e., using them, is it impossible to prove 
both p and its contradiction ~p? 

(ii) Can one test the truth or falsity of any given proposition (e.g., of arithmetic) 
in a finite number of steps? 

Hilbert may have been attracted to these questions partly because he had 
established an analog of the first in his Grundlagen der Geometrie, by using Cartesian 
geometry as a model for Euclidean plane geometry, and of the second for polynomial 
ideal bases by general transfinite arguments using the “ascending chain condition.” 

Question (i) was given a positive answer by Ackermann (who had earlier proved 
the redundancy of one of the Whitehead-Russell axioms) and von Neumann in 1927, 
under suitable restrictions. These restrictions, which are quite technical,²⁶ seemed 
quite harmless at first sight, and led to a feeling of optimism about Hilbert’s program 
in the years 1927-1930. 

Question (ii), the Entscheidungsproblem or Decidability problem, was however 
given a negative answer, even for arithmetic propositions, by Gödel in 1931. By an 
ingenious use of metamathematical reasoning, ultimately based on Cantor’s diagonal 
construction, he inferred from this undecidability the incompleteness of the Whitehead- 
Russell-Hilbert system in the following sense. Assuming as true the additional 
consistency axiom, that ‘‘false formulas are unprovable,’’ one can prove a number- 
theoretic formula which would not be provable without it. It is a corollary that one 
cannot prove that Hilbert’s axioms are contradiction-free, so that in particular 
Question (i) is undecidable. 

Thus Gödel’s paper shattered Hilbert’s high hopes. To quote Hermann Weyl²⁷: 
“Gödel enumerated the symbols, formulas, and sequences of formulas in Hilbert’s 
formalism in a certain way, and thus transformed the assertion of consistency into 
an arithmetic proposition. He could show that this proposition can neither be 
proved nor disproved within the formalism. This can mean only two things: either 
the reasoning by which a proof of consistency is given must contain some argument 
that has no formal counterpart within the system, i.e., we have not succeeded in 
completely formalizing the procedure of mathematical induction; or hope for a 
strictly ‘finitistic’ proof of consistency must be given up altogether. When Gentzen 
(1936) finally succeeded in proving the consistency of arithmetic he trespassed those 
limits indeed by claiming as evident a type of reasoning that penetrates into Cantor’s 
‘second class of ordinal numbers’.” 

Gödel’s result ended abruptly a half-century of optimism about symbolic logic, 
at least as formalized by Peano, Whitehead, and Russell. It showed that their formalizations 
were incapable of resolving the paradoxes and ambiguities of Cantor’s 
theory of infinite sets.²⁸ 


THE REIGN OF MODERN ALGEBRA, 1930-1970. 


10. The rise of “modern” algebra. Just before Gödel shattered the high hopes of 
symbolic logicians for formalizing all mathematics in terms of “Peanese,” van der 
Waerden’s Moderne Algebra (1930-31) precipitated a new revolution. The goal of 
this brilliantly written book is clearly stated in its preface. 

“The ‘abstract’, ‘formal’ or ‘axiomatic’ direction, which has given to algebra 
renewed momentum,²⁹ has above all led to a series of new concepts in group theory, 
field theory, valuation theory, and the theory of hypercomplex numbers, to insight 
into new connections and to far-reaching results. The main aim of this book is to 
introduce the reader into this new world of concepts.” 

As I have indicated, both the axiomatic approach and much of the content of 
“modern” algebra date back to before 1914. However, even in 1929, its concepts 
and methods were still considered to have marginal interest as compared with those 
of analysis in most universities, including Harvard. By exhibiting their mathematical 
and philosophical unity, and by showing their power as developed by Emmy Noether 
and her other students (most notably E. Artin, R. Brauer, and H. Hasse), van der 
Waerden made “modern algebra” suddenly seem central in mathematics. It is not 
too much to say that the freshness and enthusiasm of his exposition electrified the 
mathematical world, especially mathematicians under 30 like myself. 

In particular, it made classical real and complex algebra seem passé, or at least 
a part of analysis and not of “algebra” in the true sense. This view is exemplified in 
Moderne Algebra, where the real and complex fields are not even defined until after 
Galois theory has been presented, and the existence and uniqueness of a smallest 
algebraically closed extension of any field (Steinitz, 1910) are proved purely 
algebraically (by transfinite induction). What a contrast with the texts of Weber, 
Serret, and Perron! 


11. Lattice theory. This new attitude was a major stimulus in the rebirth of lattice 
theory, which had lain dormant since the pioneer papers of Dedekind. In 1933, I 
wrote that lattice theory provided “a point of vantage from which to attack combinatorial 
problems in ... abstract algebra.”³⁰ And by 1938, enough progress had 
been made in applying it to logic, algebra, geometry, probability, measure and 
integration theory, and functional analysis to cause the American Mathematical 
Society to hold a symposium on the then very fresh subject.³¹ 


12. College algebra. The displacement of classical algebra by modern algebra 
took time. Thus it was not until after World War II that modern algebra became 
popular at the college level in our country—a popularity due partly to the Survey 
of Modern Algebra which Mac Lane and I had published in 1941. Actually, our 
approach seems quite conservative today! Thus, unlike van der Waerden, we pre- 
sented the essentials of the theory of equations before defining groups, and the theory 
of real and complex matrices (including the Principal Axis Theorem for symmetric 
and Hermitian matrices) with geometric applications before Galois theory. We also 
included Boolean algebra, thinking it essential for students to understand the algebra 
of sets and logic; I shall return to this later. 


13. Bourbaki’s influence. Abstract mathematics, as reformulated by N. Bourbaki³²
in his Eléments de Mathématique, was popularized in French universities not long 
after. This many-volume treatise, mostly written in the decade 1945-55, attempts to 
develop all of (pure) mathematics systematically from the notions of set and function: 
it presents the content of mathematics as concerned with abstractly conceived relation- 
al structures over sets and mappings (especially morphisms) between them; cf. 
Book 1, Ch. 4. 

Algebraic structures are treated in this spirit in Book 2, as defined by sets of 
elements with (internal or external) finitary operations. The reader is then led 
authoritatively and surely through a carefully polished and systematic sequence of 
definitions, examples, and theorems about groups, rings, fields, and most of the 
other kinds of systems I have mentioned. Other branches of mathematics are treated 
in much the same style in later books. The net effect is to make mathematics appear as
a polished monolith, built purely deductively from the notions of set and function.


14. The flowering of abstract algebra. The enthusiasm generated by van der
Waerden’s book, reinforced in the ways that I have described, has given rise to an 
unprecedented flowering of all aspects of abstract algebra over the past 40 years. In
particular, the theories of groups, rings and fields (to which the bulk of Moderne
Algebra was devoted) have achieved new levels of depth and sophistication, of which 
perhaps the most dramatic example is the result that every finite group of odd order 
is solvable. This result, proved by Thompson and Feit in over 200 pages of very 
technical reasoning, had long been conjectured — but to prove it would have seemed 
hopeless to most mathematicians in 1930. 

The last 40 years have also seen the theories of Lie, Jordan, and multilinear 
algebras mature to a point that makes what was known in 1930 seem amateurish if 
not naive. The same is true of lattice theory, semigroup and quasigroup theory, 
category theory, and homological and combinatorial algebra, all of which were 
either unknown or nearly so in 1930. Finally, algebraic geometry has become rigorized
as a new branch of axiomatic algebra, based securely on deep results about
commutative rings and their ideals and valuations.³³


15. Wider repercussions. The tidal wave generated by enthusiasm about abstract 
algebra had wider repercussions. Thus to young men in 1930, like myself, van der 
Waerden’s book made classical analysis stemming from the calculus (‘‘analyse 
infinitésimale’’), which had dominated mathematics for over two centuries, suddenly 
seem old and tired. Indeed, the abstract approach adopted by van der Waerden for 
algebra soon became fashionable in functional analysis and topology. The idea that 
all mathematics could be viewed as topological algebra gained a strong impetus 
from the solution of Hilbert’s Fifth Problem, which showed that the hypothesis of 
differentiability could be replaced by mere continuity in the theory of Lie groups: any 
locally Euclidean continuous group is isomorphic to an analytic Lie group [22, p. 184].
Even research on partial differential equations, the traditional stronghold of the 
applied mathematician, has increasingly centered around the quest for new abstract 
concepts permitting one to prove extremely general existence and uniqueness theorems. 

Partly because of such shifts in emphasis, by 1960 most younger mathematicians 
had come to believe that all mathematics should be developed axiomatically from 
the notions of set and function, and this approach had come to seem no longer 
modern but classical! By 1959, van der Waerden had changed his title from “Moderne
Algebra” to “Algebra.” And in the 1960’s, Mac Lane and I wrote another “Algebra”
which went further in the direction of abstraction, by organizing much of pure algebra
around the central concepts of morphism, category, and “universality.” The “universal”
approach to algebra, which I had initiated in the 1930’s and 1940’s stressing
the role of lattices, was developed much further in two important books by Cohn
and Grätzer. In a parallel development, Lawvere (1965) proposed “The category of
categories as a foundation for mathematics,” beginning with the statement³⁴:


In the mathematical development of recent decades one sees clearly the rise of the conviction 
that the relevant properties of mathematical objects are those which can be stated in terms of 
their abstract structure rather than in terms of the elements which the objects were thought to 
be made of. The question thus naturally arises whether one can give a foundation for mathematics
which expresses wholeheartedly this conviction concerning what mathematics is about, and in
particular, in which classes and membership in classes do not play any role. Here by “foundation”
we mean a single system of first-order axioms in which all usual mathematical objects can be
defined and all their usual properties proved. A foundation of the sort we have in mind would
seemingly be much more natural and readily-usable than the classical one when developing such 
subjects as algebraic topology, functional analysis, model theory of general algebraic systems, 
etc. 


16. The “new mathematics” of 1960. In the post-Sputnik era of the early and
middle 1960’s, enthusiasm went even further. Especially in the United States, a 
vogue developed for exposing school children to formal concepts of set, function 
and axiom often only half-appreciated by their teachers! Its proponents encouraged 
the spread of the myth that these constituted a ‘‘New Mathematics,’’ unknown 
fifty years earlier. One ostensible aim of this vogue was to indoctrinate young people 
so that they could fill a supposed shortage of mathematical teachers and research 
workers. This seemed highly desirable at a time when our postwar “baby bulge” and
prosperity were quadrupling the demand for college teachers of mathematics, while
an unquestioning faith in the value of basic science was increasing the support for 
research in pure mathematics at a rate of 10-15 per cent annually. But as of 1972, 
it all seems strangely out-of-date! 

To summarize, algebra developed harmoniously during the years 1930-60, with 
its main stream flowing smoothly, swiftly, and finally triumphantly in the channels 
I have described. Some measure of its triumph may be found in the fact that, whereas 
three of the first four Fields medals were awarded in Analysis (in 1936 and 1950), 
three of the four awarded in 1970 were in Algebra. 

However, in the last 5-10 years, powerful new currents have become apparent. 
Some of these have arisen as countercurrents to extremism; thus René Thom has 
recently written a thought-provoking article entitled ‘Modern’ Mathematics: An
Educational and Philosophical Error,³⁵ in which he urges that geometry should
replace algebra because “any question in algebra is either trivial or impossible to
solve. By contrast, the classic problems of geometry present a wide variety of
challenges.”

However, I do not wish to dwell on the exaggerations of a decade which most 
of us recall with nostalgia. Extreme abstraction in research circles, attempts to in- 
culcate premature sophistication in children, and uncritical expansionism in basic 
physical science have provoked reactions which by now threaten to go too far in the 
opposite direction. 

Instead, I wish to describe four positive current trends in algebra which, in my 
opinion, hold great promise for the future. 


FOUR COMPUTER-INFLUENCED CURRENT TRENDS 


17. The new numerical algebra. Already in the 1940’s a new revolution was 
brewing, whose ultimate implications for mathematics are unpredictable. Namely,
the construction of efficient high-speed digital computers was making it feasible
to solve mathematical problems whose effective solution would have previously been 
prohibitively costly and time-consuming. To many mathematicians, including myself, 
it had become evident by 1950 that the resulting revolution in applied mathematics 
would open up challenging new areas for basic research. In particular, since digital 
computers can only represent real numbers to a finite number of significant digits, 
and can only represent values of real functions at a finite number of points (approximate
“nodal values” at “mesh-points”), their use in solving differential
equations (e.g., from physics or engineering) requires a very careful numerical
analysis of roundoff and truncation errors.³⁶

Thus, to actually solve a system of differential equations (to a desired approxima- 
tion), one usually first replaces it by an approximating system of algebraic equations 
(obtained perhaps by finite difference or finite element methods), whose unknowns 
typically represent nodal values at mesh-points, which is then solved (also approxima- 
tely) on a digital computer. I shall say nothing about this first step of discretization 
here, because the theorems in numerical analysis and approximation theory required 
to justify it belong to classical analysis and not to algebra. Suffice it to say that it 
often leads to very large matrices and associated systems of simultaneous linear 
equations, which may involve 50,000 or more unknowns! The main problem is to 
solve these efficiently. 

These matrices typically have many special properties, which must be exploited 
to achieve efficiency. They are usually very sparse (have mostly zero entries), and 
often symmetric, or symmetrizable by permutations or linear transformations. Their 
diagonal elements may be “dominant” (i.e., at least as great as the sum of the absolute
values of the other entries in the same row), and they may have positive diagonal and
negative off-diagonal entries. Matrices having all of the above properties are essentially
what are called Stieltjes matrices; they arise naturally in network flow problems.
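
The properties just listed can be checked mechanically. The following Python sketch is an illustration added here, not part of the original lecture: it builds the familiar tridiagonal second-difference matrix (2 on the diagonal, -1 beside it), which is assumed as a convenient stand-in for the matrices produced by discretization, and tests it for symmetry, weak diagonal dominance, the sign pattern described above, and sparsity. All function names are hypothetical.

```python
def second_difference_matrix(n):
    """n x n tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(n)]
            for i in range(n)]


def is_symmetric(a):
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


def is_diagonally_dominant(a):
    # Weak row dominance: |a_ii| >= sum of |a_ij| over j != i.
    return all(
        abs(row[i]) >= sum(abs(x) for j, x in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )


def has_sign_pattern(a):
    # Positive diagonal, non-positive off-diagonal entries
    # (most off-diagonal entries of a sparse matrix are zero).
    n = len(a)
    return all(
        a[i][j] > 0 if i == j else a[i][j] <= 0
        for i in range(n) for j in range(n)
    )


def density(a):
    """Fraction of entries that are nonzero."""
    n = len(a)
    return sum(1 for row in a for x in row if x != 0) / (n * n)


A = second_difference_matrix(5)
print(is_symmetric(A), is_diagonally_dominant(A), has_sign_pattern(A), density(A))
```

For an n x n matrix of this kind only 3n - 2 of the n² entries are nonzero, which is why storing and eliminating on the full array is wasteful when n runs into the tens of thousands.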

One usually wants to either: (i) solve the linear system (written symbolically
$Ax = b$), or (ii) determine eigenvalues of $A$ (the latter are of course the roots of
$|A - \lambda I| = 0$). As regards (i), most mathematicians imagined in 1940 that large
linear systems should be solved (if at all!) by Gaussian elimination, and that the rest
was sheer drudgery. A few eminent analysts (including Gauss, Jacobi, and von
Mises) had appreciated the value of iterative methods (also used by Gauss) and had
studied their rates of convergence, but these methods were (and still are!) totally
ignored in textbooks on “linear algebra.” Similar remarks apply to eigenvalue
problems, where the experience of most mathematicians was limited to 3 × 3 (if not
to 2 × 2) matrices $A = \|a_{ij}\|$, whose eigenvalues they might have found using
textbook formulas to solve the cubic characteristic equation

$$\lambda^3 - (a_{11} + a_{22} + a_{33})\lambda^2 + B\lambda - \det A = 0,$$

where

$$B = a_{22}a_{33} + a_{33}a_{11} + a_{11}a_{22} - a_{23}a_{32} - a_{31}a_{13} - a_{12}a_{21}.$$


In practice, such textbook methods are extremely inaccurate and inefficient for
most large matrices,³⁷ and they were replaced in the 1950’s by new algorithms, whose
invention and analysis created a major new area of “classical” algebra: the new
numerical algebra. Excellent surveys of what is now known about this area are
contained in authoritative books by Varga [17], Wilkinson [19], and Young [20]; 
every forward-looking young algebraist should at least be cognizant of their contents! 
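
To make the idea of an iterative method concrete, here is a minimal Python sketch of the simplest one, the iteration usually named after Jacobi. It is an illustration added here rather than an algorithm taken from the books cited; the 3 × 3 system, the tolerance, and the function name are assumptions chosen for the example. Each sweep solves the i-th equation for the i-th unknown, using the values from the previous sweep for the others; for a strictly diagonally dominant matrix the sweeps converge.

```python
def jacobi(a, b, tol=1e-10, max_iter=10_000):
    """Approximate the solution of a x = b by Jacobi iteration.

    Returns the approximate solution and the number of sweeps used.
    """
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_iter + 1):
        # Solve equation i for x_i, holding the other unknowns at old values.
        x_new = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new, sweep
        x = x_new
    raise RuntimeError("Jacobi iteration did not converge")


# A small strictly diagonally dominant system whose exact solution is (1, 2, 3).
a = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
solution, sweeps = jacobi(a, b)
print(solution, sweeps)
```

Each sweep touches only the nonzero entries of a row, so for a sparse matrix its cost grows with the number of nonzeros rather than with n².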


18. Sparse matrices. The past five years have also seen substantial improvements 
(over Gauss) in elimination techniques for solving large systems with sparse coef- 
ficient-matrices. In particular, these have drawn on graph theory for ideas; see [15] 
for a cross-section of current work. 

There are many other interesting new areas of research in (real and complex) 
numerical algebra. I shall just mention three of the most important; references to 
activity in them may be found in many review journals: 

(a) Finding the roots of polynomial equations of degrees up to 100. 

(b) “Unconstrained” minimization of functions of many variables.

(c) Linear programming and other techniques for finding minima of functions
subjected to “constraints” by equations and inequalities.

Actually these “new” areas also originated in the 1940’s, if not earlier. Thus by
1947, linear programming was defined, and the “simplex method” of solving its
problems invented by George Dantzig; see p. 20 of G. Hadley’s Linear Programming
(Addison-Wesley, 1962). Moreover, its fundamental techniques were made
accessible at the college freshman level by Kemeny, Snell, and Thompson 10 years 
later, in their popular Introduction to Finite Mathematics (Prentice-Hall, 1957). 


19. Integer arithmetic. In programming languages for computers, a basic distinction
is made between exact “integer arithmetic” and approximate “real
arithmetic.” I have omitted the problems of “integer programming” and of solving
Diophantine equations on a computer in the above discussion, because they involve
integer and not real and complex numerical algebra. Nevertheless, activity in these
fields represents another strong trend in contemporary numerical algebra.


20. Theory of automata. Although many mathematicians think of high-speed
computers as simply “number-crunchers” or super slide rules whose primary mathematical
role is to carry out elaborate numerical computations, and although “arithmetic
units” may be the most highly organized special pieces of computer “hardware,”
computers are actually much more versatile. Large general purpose computers
are designed to be universal instruments, capable of expediting all kinds of “mental”
tasks. Much as the Industrial Revolution was made possible by machines which could
perform all kinds of “physical” tasks more cheaply and efficiently than human
beings, the Computer Revolution is aimed at doing the same with mental tasks. This
prospect makes the study of computers especially fascinating. From a mathematical
standpoint, partly because general purpose computers are digital assemblies of a
finite set of components, their study is based on a new, purely algebraic concept
which I shall now define axiomatically.


DEFINITION. A finite state machine (or “automaton”) M consists of a collection
A of “input symbols,” a collection S of “states,” and a collection Z of “output
symbols,” related by two operations $\nu: A \times S \to S$ and $\zeta: A \times S \to Z$. The operation
$\nu$ assigns to each “input symbol” $a \in A$ and “prior state” $s \in S$ a “new state”
$\nu(a,s) \in S$; the operation $\zeta$ assigns to $a$ and $s$ a “printout” $\zeta(a,s) \in Z$. More concretely,
such a finite state machine M can be thought of as evolving from a specified starting
state $s_0$, recursively by $s_k = \nu(a_k, s_{k-1})$, and as printing out $z_k = \zeta(a_k, s_{k-1})$ for
$k = 1, \ldots, n$ in succession. In this way, it converts strings of input symbols or programs
$a_1, a_2, \ldots, a_n$ into printouts $z_1, z_2, \ldots, z_n$.

Abstractly, a finite state machine is clearly just a new kind of algebraic system
$M = [A, S, Z; \nu, \zeta]$. If one simplifies M by ignoring Z and $\zeta$ (this is called a “forgetful
functor” in category theory), the simplified M just describes the action of a free
semigroup (the set A* of all possible input “programs”) on a set (the set S of states).
The resulting theory of state machines without output fits nicely into axiomatic
(or “modern”) algebra and, as has recently been shown,³⁹ so-called “universal
algebra” can be applied to it.
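
The definition can be turned into a few lines of Python. The sketch below is an illustration added here under stated assumptions: the parity machine, the helper names run_machine and act, and the choice of writing the input symbol as the first argument (as in ν(a, s)) are all made for the example and do not come from the original text. The function act drops the output map and shows the action of an input word on a state, which is the semigroup action mentioned above.

```python
def run_machine(nu, zeta, s0, program):
    """Run a finite state machine from state s0 on a sequence of input symbols.

    Returns the final state and the list of printouts z_1, ..., z_n.
    """
    state, printout = s0, []
    for a in program:
        printout.append(zeta(a, state))  # z_k depends on a_k and the prior state
        state = nu(a, state)             # s_k depends on a_k and the prior state
    return state, printout


def act(nu, word, s):
    """Action of an input word on a state, ignoring outputs."""
    for a in word:
        s = nu(a, s)
    return s


# Hypothetical example: A = {0, 1}, S = {"even", "odd"}, Z = {0, 1}.
# The state records the parity of the number of 1s read so far.
def nu(a, s):
    if a == 1:
        return "odd" if s == "even" else "even"
    return s


def zeta(a, s):
    # Print 1 when the parity after reading a is odd, otherwise 0.
    return 1 if nu(a, s) == "odd" else 0


final_state, printout = run_machine(nu, zeta, "even", [1, 0, 1, 1])
print(final_state, printout)
```

Because act applies the symbols of a word one after another, acting by a concatenated word u followed by v gives the same state as acting by u and then by v; this is exactly what it means for the free semigroup A* to act on S.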


21. Turing machines. Quite similar to finite state machines, but a little more complicated,
are the “Turing machines” invented by the logician Turing in 1936, before
high-speed general purpose digital computers existed. Turing proved that they
could indeed carry out most processes of mathematical “thought.” Thus they are
capable of printing out the binary or decimal expansion of any “definable” (alias
“computable”) real number, such as $e$, $\pi$, or the kth zero of the Bessel function $J_n(x)$,
and they can “deduce all the provable formulas of the restricted Hilbert functional
calculus,” giving all true theorems and no false ones.

Some two decades after Turing showed that his machines could, in principle, 
carry out the kind of mechanical theorem-proving dreamed of by Leibniz, Whitehead 
and Russell, and Hilbert, Hao Wang did this in practice. Namely, he wrote a special 
program which produced “proofs” in minutes for all the 350 theorems in the predicate
calculus with equality that were actually stated in Whitehead and Russell’s Principia
Mathematica!⁴⁰


22. Computational complexity; optimization. A third and very strong trend in 
algebra, and indeed in mathematics generally, is a concern with computational 
complexity and with optimization. In all applications of algebra, of course, the 
efficiency of symbol-manipulation is a prime consideration, but for many years it 
was taboo to discuss it in research journals devoted to pure mathematics. 

This snobbish taboo against discussing efficiency obscured some very important 
basic facts. Thus, in the area of mathematical logic, the scholarly books by 
Whitehead and Russell and the Hilbert school did not seriously try to improve the
efficiency of formal deductive schemes, whereas Leibniz and Peano were really trying
to (and did; especially Leibniz!) develop symbolic techniques for making mathematical
reasoning more efficient and, therefore, more powerful. This difference becomes
painfully obvious if one compares the number of symbols required by Whitehead 
and Russell to derive the basic formal properties of sets and relations, with the 
number of words needed by mathematicians to get equally far. So far, it is only in 
the area of the propositional calculus of logic itself, and by using a powerful computer, 
that mechanical theorem-proving has been realized on a substantial scale (by Hao 
Wang, see §21). 

Having finally recognized the importance of efficiency, mathematical logicians 
have begun to analyze the “computational complexity” of applying general definitions
to particular cases. Their analysis has already borne fruit in the development
of shorter procedures for multiplying numbers and matrices. 

Concern with computational complexity in algebra has as its ultimate goal, of 
course, the optimization of symbolic methods. In turn, the question of optimization 
has already suggested a number of basic problems whose solution should be a 
continuing challenge, stimulating coming generations of pure algebraists. Two of 
these are, respectively: (i) the “shortest form” problem of Boolean algebra, and (ii)
the “most efficient coding” problem of information theory.

Other fascinating optimization problems, concerning which surprising discoveries 
have recently been made, are: (iii) what is the least number of operations on digits 
required to multiply two n-digit integers? (iv) what is the smallest number of arith- 
metic operations required to multiply together two n x n matrices? (v) how can one 
solve n simultaneous linear equations in n unknowns with the fewest additions, 
subtractions, multiplications and divisions? I regret that I do not have time to discuss 
these problems here, and must refer you to the literature ([21] and [11, vol. 2, pp.
258-78]).
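
As one illustration of problem (iii), the following Python sketch implements the divide-and-conquer scheme usually attributed to Karatsuba. The text above does not name any particular method, so the choice of this one, the decimal splitting, and the function name are assumptions made for the example. The point is that the product of two 2m-digit numbers can be assembled from three products of roughly m-digit numbers instead of the four used by the schoolbook method, and applying this recursively reduces the number of digit operations below the order of n².

```python
def karatsuba(x, y):
    """Multiply non-negative integers using three half-size products per level."""
    if x < 10 or y < 10:
        return x * y  # single-digit factor: multiply directly
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)  # x = x_hi * base + x_lo
    y_hi, y_lo = divmod(y, base)  # y = y_hi * base + y_lo
    lo = karatsuba(x_lo, y_lo)
    hi = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - lo - hi equals x_lo*y_hi + x_hi*y_lo,
    # so the two cross terms cost one multiplication instead of two.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - lo - hi
    return hi * base * base + cross * base + lo


print(karatsuba(1234, 5678) == 1234 * 5678)
```

The sketch trades one multiplication for a few extra additions and subtractions at each level; whether that pays off in practice depends on the size of the operands and on the machine.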


23. Combinatorial algebra. A fourth current trend in algebra is towards emphasis 
on combinatorial ideas,⁴¹ and especially on those involving graphs or networks.
This trend is surely due to an intuitive recognition of the fact that digital computers
and the deductive procedures of mathematics have a structure whose analysis
requires combinatorial methods. As Hermann Weyl wrote in 1949: “The network
of nerves joining the brain with the sense organs is a subject that by its very nature
invites combinatorial investigation. Modern computing machines translate our
insight into the combinatorial structure of mathematics into practice by mechanical
and electronic devices.”⁴²

From burgeoning elementary courses in “Discrete Mathematics” which are
intended to precede courses in axiomatic algebra,⁴³ probability and statistics, to the
ambitious 7-volume treatise [11] on The Art of Computer Programming being
written by Donald Knuth, the new emphasis is the same: permutations, combinations,
partitions, generating functions, trees, sorting, searching, recurrences, and difference
equations, block designs, and so on. Even a casual reading of the books I have cited 
makes it very clear that the 200 year reign of the Calculus and Analysis has ended — 
and that they will continue to be displaced in our colleges by courses in Algebra in 
the broadest sense of discrete mathematics and the science (no longer just art!) of 
symbol manipulation. 

In a sense, this trend continues the revolution begun by van der Waerden, but 
there has been a major change. No longer do axioms and deductive systems, patterned 
after Euclid’s Elements, seem so fundamental. Neither do groups or rings, with their 
subgroups, subrings and morphisms. Their place is taken by various relational 
structures (including partial orderings and ‘‘complexes’’ in the sense of combinatorial 
topology) which are far less amenable to the general algebraic techniques which 
played such a central role in the “modern algebra” of 1930-1960.

Instead, the kinds of algebraic structures (as contrasted with ‘‘relational’’ struc- 
tures) which are most relevant to digital computers and combinatorics are loops, 
monoids and lattices (or groupoids, semigroups and semilattices), which were largely 
ignored by most algebraists in 1930-1960. Loosely speaking, much as groups are 
related to symmetries, so loops are related to designs (or “patterns”), monoids to
actions (e.g., of input instructions on the states of an automaton), and lattices to
structure.

In particular, Rota⁴⁴ and his associates have shown that lattice theory provides a
point of vantage from which to attack combinatorial problems in general, and not 
just those of algebra as I had stated in 1933 (see §7). Going even further, N. S. 
Mendelsohn has very recently applied concepts of universal algebra to generate 
combinatorial designs and vice-versa [23, pp. 123-32]. 

One naturally wonders where all these new trends will lead. I am myself sure
of only one thing: that they will not make the classical ‘‘modern algebra’’ expounded 
in van der Waerden obsolete, any more than this made real and complex algebra or 
the calculus obsolete. As Knuth emphasizes ([11, vol. 1, p. 1]; see also [1]) the word 
algorithm (or ‘‘algorism’’) which is so central to the mathematics of computation is 
just a corruption of the name Al-Khwarizmi, the originator of the word ‘‘algebra.’’ 

Indeed, the four current trends in algebra which I have been describing were 
merely stimulated by the consideration of digital computers, in much the same way 
that the calculus and analysis were stimulated by thinking about geometry, mechanics 
and mathematical physics. They are simply opening up new areas of mathematics 
for future generations to study, with an ever increasing variety and richness of 
interrelations and applications, in which old and new ideas will mingle and be 
reshaped. Within a few decades, new concepts and trends may well emerge from 
this mingling and reshaping. Certainly, this kind of continuing evolution is the only 
thing that can keep algebra perennially a fresh and exciting subject!


Notes 


1 See footnote on page 761. 

2 For this and other facts, I am indebted to Professor David Pingree of Brown University; 
Thomas Hawkins, Gian-Carlo Rota, Gerald Sachs, and John Tate made other very helpful comments. 

3 A notable exception is provided by the binomial theorem, discovered by Pascal in 1653. For 
readable accounts of the facts summarized in this section, see Rouse Ball [1] and E. T. Bell [3]. 

4 Their expositions were very obscure; see G. Birkhoff, Isis 3 (1973), 260-7. That of Galois was 
clarified by Betti in 1852. 

5 We here follow the usual custom of letting Z (for the German “Zahl” meaning integer) stand
for the set {0, ±1, ±2, ...}. Gauss attributed the consideration of integers mod n (“modular numbers”)
to Legendre.

6 For penetrating historical surveys of linear and non-commutative algebra, see N. Bourbaki, 
[6, pp. 78-91 and 120-28]. For a readable summary of Cayley’s contribution, see pp. 102-15 of E. T. 
Bell [2]. 

7 Gergonne’s Annales 5 (1814-15), p. 93; for Hamilton’s ideas, see his Mathematical Papers, 
vol. III, Cambridge Univ. Press, 1967. Leibniz and Cramer had very fragmentary ideas about de- 
terminants; see [1, p. 375] and D. J. Struik, A Source Book in Mathematics, Harvard University 
Press, 1969, p. 180. 

8 R. Woodhouse, Phil. Trans. 91 (1801), 89-119; G. Peacock, Reps. Brit. Assn. Adv. Sci. 3 
(1834), 185-32 and Algebra, 2 vols., 1845; A. de Morgan, Trans. Camb. Phil. Soc., 7 (1839) 173-87 and 
287-300; G. Boole, Cambridge and Dublin Math. J., 3 (1848) 183-98. 

9 F. Klein, Entwicklung der Mathematik im 19ten Jahrhundert, vol. 1, p. 175, characterized this
as “almost unreadable.” 

10 In his supplements to Dirichlet’s Vorlesungen über Zahlentheorie, 1863, 1871.

11 Benjamin Peirce, Linear Associative Algebra, Boston, 1870; see also Amer. J. Math., 3 
(1880) 15-57, and 4 (1881) 97-229 (reprinted from Proc. Am. Acad. Boston, 1875). 

12 See his Collected Papers, vol. 2, Cremonese, Rome, 1958, p. 134. In the Amer. Math. Society
Semicentennial Addresses, vol. 2, p. 15, Bell attributed the postulational approach to Peano! Peano 
was also the first to number his theorems. 

13 Volumes 3-6 of the Transactions of the (then young) American Mathematical Society (1902-5) 
contain a dozen articles on postulate systems by the men named above. 

14 Euclid’s Elements, which included “axioms” for magnitudes (algebra) as well as “postulates”
for geometry, were written in Alexandria, Egypt, around 300 B. C.; see Ball [1]. 

15 For a historical discussion of ideal theory and Dedekind’s work on algebraic numbers, see 
[3, Ch. 10]. 

16 Op. cit. supra, pp. 216-29. The same result was proved independently by Frobenius, op. cit. 
infra. 

17 G. Frobenius, Crelle, 84 (1878) 1-63, and Berlin Sitzb. (1903) 504-37 and 634-5. Wedder- 
burn’s Lectures on Matrices, Amer. Math. Soc., 1934, contains a complete bibliography to 1933. 

17a See Thomas Hawkins, Archive for History of Exact Sciences, 8 (1972) 243-87. 

18 A related symbolic style of writing was used by E. H. Moore in his Introduction to a Form 
of General Analysis, New Haven Colloquium, Yale Univ. Press, 1910. 

19 See F. Cajori, “Past struggles between symbolists and rhetoricians...”, Proc. Int. Math.
Congress Toronto (1924), vol. 2, pp. 937-41. 

20 First published in 1889 (Arithmetices principia nova methodo exposita). 

21 The fact that this was so had been airily asserted a decade earlier by Russell in his witty Princi- 
ples of Mathematics, of which Principia Mathematica was originally intended to be comprised in a 
second volume! 

22 See G. Birkhoff, “Mathematics and Psychology,” SIAM Review, 11 (1969) 429-69.

23 Werke, vol. 3, p. 153; Math. Annalen 78, 405-15. 

24 Hilbert is here slurring over Euclid’s distinction between “axioms” (for magnitudes in general)
and “postulates” (for geometrical entities).

25 Of the books [10] and Grundlagen der Mathematik (2 vols., 1939), respectively. 

26 See S. C. Kleene, Introduction to Metamathematics, Van Nostrand, 1952, pp. 204-5.

27 This MONTHLY, 53 (1946) 1-18. Gödel’s original paper was published in the Monats. Math.
Phys., 38 (1931) 173-98. 

28 Careful historical reviews of the question touched on here may be found in N. Bourbaki, [6, 
Ch. 1], and (by P. Bernays) in Hilbert’s Werke, vol. 3, pp. 196-217; this volume also contains
Hilbert’s papers on logic.

29 In German, “der die Algebra ihren erneuten Aufschwung verdankt.” 

30 Proc. Camb. Phil. Soc., 29 (1933) 441. 

31 Bull. Amer. Math. Soc., 44 (1938) 793-827. 

32 A pen-name assumed in 1937 by a group of then young French mathematicians, who wished 
to overthrow the domination of French mathematics by classical analysts. See this MONTHLY, 57 (1950) 
221-32 for an authentic statement of Bourbaki’s opinions, including the view that the axiomatic method
is “a standardization of mathematical technique,” and that the principal mathematical structures
are those of a group, of order, and of a topological space. 

33 For example, anyone doing serious research on algebraic “geometry” today is expected to
consider the two-volume treatise on Commutative Rings by O. Zariski and P. Samuel as standard 
preliminary material, but not to know Newton’s classification of real cubic curves! 

34 F. William Lawvere, “The category of categories as a foundation for mathematics,” Proc.
Conf. Categorical Algebra, La Jolla, 1965 (S. Eilenberg et al., eds.), Springer, 1966.

35 American Scientist, Nov.-Dec., 1971.

36 Mathematicians habituated to exclusively deductive reasoning should realize that, in practice, 
error analysis relies very heavily on empirical evidence as well as on theoretical principles. 

37 Though not nearly as inefficient as Cramer’s Rule, which is still often the only prescription
given to students! 

38 See Marvin Minsky, Computation: Finite and Infinite Machines, Prentice-Hall, 1967. 

39 G. Birkhoff and J. D. Lipson, “Heterogeneous Algebras,” J. Comb. Analysis, 2 (1969).

40 H. Wang, IBM J. Res. Develop., 4 (1960) 2-22. For the general question of the computer 
as a “brain,” see the reference of note 22.

41 Wallis, Tchirnhaus, and Leibniz all recognized before 1700 that combinatorics belonged to 
algebra. See [13, p. 14] and [21, p. 2]. 

42 E. F. Beckenbach (editor), Applied Combinatorial Mathematics, Wiley, 1964, p. 537. 

43 As currently recommended by the CUPM Panel on the Impact of Computing on Mathematics 
Courses. On an intermediate level, see C. L. Liu, Introduction to Combinatorial Mathematics, 
McGraw-Hill, 1968; on a more advanced level, see M. Hall, Combinatorial Theory, Ginn, 1967. 

44 “On the foundations of combinatorial theory,” J. für Wahrsch., 2 (1966) 340-68; Combinatorial
geometries (preliminary edition), M.I.T. Press, 1970; and refs. given there.


References 


1. W. W. Rouse Ball, A Short History of Mathematics, 3rd ed., Macmillan, New York, 1901. 

2. Eric T. Bell, Mathematics: Queen and Servant of Sciences, McGraw-Hill, New York, 1951. 

3. Eric T. Bell, The Development of Mathematics, McGraw-Hill, New York, 1940.

4. Garrett Birkhoff and Thomas C. Bartee, Modern Applied Algebra, McGraw-Hill, New 
York, 1970. 

5. Garrett Birkhoff and Saunders Mac Lane, A Survey of Modern Algebra, Macmillan, New York,
1941.

6. Nicolas Bourbaki, Eléments d’Histoire des Mathématiques, Hermann, Paris, 1960. 

7. Florian Cajori, A History of Mathematical Notations, 2 vols., Open Court, Chicago, 1928-9. 

8. George Grätzer, Universal Algebra, Van Nostrand, Princeton, N. J., 1968.

9. David Hilbert, Grundlagen der Geometrie, 1899; 2nd. ed., 1901. Authorized translation by 
E. J. Townsend, Open Court, Chicago, 1902, 1910. 

10. David Hilbert and W. Ackermann, Grundzüge der theoretischen Logik, 4th ed., 1949.

11. Donald Knuth, Algorithms, 7 projected volumes, Addison-Wesley, Reading, Mass., 1969. 

12. S. Mac Lane and G. Birkhoff, Algebra, Macmillan, New York, 1967. 

13. Uta Merzbach, “... Development of Modern Algebraic Structures from Leibniz to Dedekind,” 
Ph. D. Thesis, Harvard, 1965. 

14. Giuseppe Peano, Formulario Matematico, 4th ed., Torino, 1908. 

15. Donald Rose and Ralph Willoughby (eds.), Sparse Matrices and their Applications, Plenum 
Press, New York, 1971. 

16. B. L. van der Waerden, Moderne Algebra, 2 vols., Springer, New York, 1930-31. 

17. Richard S. Varga, Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, N.J., 1962. 

18. Alfred N. Whitehead and Bertrand Russell, Principia Mathematica, 3 vols., Cambridge 
Univ. Press, 1911. 

19. James Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1966. 

20. David M. Young, Iterative Solution of Large Linear Systems, Academic Press, New York, 
1971. 

21. Garrett Birkhoff and Marshall Hall (eds.), Computers in Algebra and Number Theory, 
SIAM-AMS Proceedings, vol. IV, Amer. Math. Society, 1971. 

22. Deane Montgomery and Leo Zippin, Topological Transformation Groups, Wiley-Interscience, 
New York, 1955. 

23. W. Tutte (ed.), Recent Progress in Combinatorics, Academic Press, New York, 1969. 


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


EXISTENCE OF FOUR CONCURRENT NORMALS TO A SMOOTH
CLOSED HYPERSURFACE OF $E^n$


BERND WEGNER, Technische Universität Berlin


A four-normals-point of a smooth closed hypersurface of $E^n$ (n-dimensional
euclidean space) is a point where at least four normals of the hypersurface intersect.
There are several examples of four-normals-points of a closed curve in the plane
(compare Chakerian and Stein [2], Deo and Klamkin [3], Guggenheimer [4]).
The purpose of this note is to give many examples of four-normals-points of a smooth
closed hypersurface $F$ of $E^n$. We shall prove that every neighborhood of a focal
point of $F$ contains a four-normals-point of $F$ if $F$ is an immersed $(n-1)$-sphere.




The proof uses elementary Morse theory which can be found in Milnor [5] or Cairns 
and Morse [1]. 

First we recall some basic results and definitions from elementary Morse
theory. Let $F$ be a closed $C^2$ hypersurface of $E^n$ which is an immersed $(n-1)$-sphere.
Let $L_p$ denote the distance function with center $p \in E^n$, i.e., $L_p(x) = \|x - p\|^2$.
The point $x \in F$ is called a critical point of $L_p$ if the differential $(L_p)_*$ of $L_p$ at $x$ vanishes;
the critical point $x$ of $L_p$ is called nondegenerate (resp. degenerate) if the matrix

$$\left( \left. \frac{\partial^2 L_p}{\partial u_i \, \partial u_j} \right|_x \right)$$

is nonsingular (resp. singular), where $u_1, \dots, u_{n-1}$ are local coordinates of $F$ around $x$;
the index of a nondegenerate critical point $x$ of $L_p$ is defined to be the maximal
dimension of a subspace of the tangent plane of $F$ at $x$ where the bilinear functional
represented by

$$\left( \left. \frac{\partial^2 L_p}{\partial u_i \, \partial u_j} \right|_x \right)$$

is negative definite. A necessary and sufficient condition for $x$ to be a critical point
of $L_p$ is that the normal line of $F$ through $x$ meets $p$. A point $q \in E^n$ is called a focal
point of $F$ with base $x \in F$ and multiplicity $\mu$ if $q = x + v$ is the image of the point
$(x,v)$ of the normal bundle of $F$ under the end point map $E$ and the differential
$E_{*(x,v)}$ of $E$ at $(x,v)$ has rank $n - \mu < n$, where $E$ is defined by $E(x,v) = x + v$.

Useful characterizations for $q$ to be a focal point of $F$ with base $x$ are the following
two (see Milnor [5], p. 32 ff): (1) $x$ is a degenerate critical point of $L_q$, (2) $\|q - x\|$
is a principal radius of curvature of $F$ in the direction $(q-x)/\|q-x\|$. Thus the set
of focal points of a plane curve is exactly the evolute of the curve.

The second characterization also implies that on the normal line of $F$ through $x$
there is only a finite number of focal points with base $x$ (at most $n-1$). The Morse
index theorem for the distance function (see [5], p. 37) states that the index of $x \in F$
as a nondegenerate critical point of $L_p$ is exactly the number of focal points of $F$
with base $x$ situated on the segment from $x$ to $p$, each being counted with its multiplicity.




THEOREM. If q is a focal point of F, then every neighborhood of q contains 
a four-normals-point of F. 


Proof. Let $q$ be a focal point of $F$ with base $x$. Let $N$ denote the unit normal
of $F$ at $x$ such that $q = x + rN$ for some $r > 0$. Furthermore let $m$ (resp. $M$) be the
square root of the absolute minimum (resp. maximum) of $L_q$ on $F$. If $m = r = M$,
then $F$ is a sphere and the statement is trivial. In the case $m < r$ take

$$q' = x + (r - \varepsilon)N$$

where $\tfrac12(r - m) > \varepsilon > 0$ such that $q'$ is not a focal point of $F$ with base $x$. Using




the triangle inequality we see that $L_{q'}(x)$ is not an absolute minimum of $L_{q'}$. Applying
the Morse index theorem for the distance function, we conclude that the index
of $x$ as a nondegenerate critical point of $L_{q'}$ is less than $n-1$ because $q$ is not contained
in the segment from $x$ to $q'$.

As we shall see below, the point $q'$ would suffice for a four-normals-point if
$L_{q'}$ were nondegenerate (i.e., all critical points of $L_{q'}$ are nondegenerate). This is
not necessarily the case. Therefore, we have to look for a nondegenerate $L_{q''}$ with
$q''$ in a sufficiently small neighborhood of $q'$ such that the nice properties of $L_{q'}$
are valid for $L_{q''}$. We shall get the existence of such nondegenerate functions as a
consequence of the following statement: There exists a real number $\eta > 0$ such
that for any point $q''$ in the $\eta$-neighborhood of $q'$, $L_{q''}$ has a critical point where the
value of $L_{q''}$ is neither an absolute maximum nor an absolute minimum. To see the
last statement, we regard a connected open neighborhood $U$ of $(x, q' - x)$ in the
normal bundle of $F$ such that the restriction $E|U$ of the end point map to $U$ is a
diffeomorphism onto an open subset of $E^n$. The existence of $U$ follows from the
Inverse Function Theorem because $q'$ is not a focal point of $F$ with base $x$ and
therefore $E_{*(x,q'-x)}$ has rank $n$. Thus for every pair $(z,v) \in U$ we have that $z$ is a
nondegenerate critical point of $L_{z+v}$. Choosing the $\eta$-neighborhood of $q'$ sufficiently
small within $E(U)$ such that the values of $L_{q'}$ at $x$ and $L_{z+v}$ at $z$ do not differ too
much for $(z,v) \in U$ and $z + v$ in this neighborhood, we get the statement above by
a simple geometric argument.

Now we continue the main proof. The set of focal points of $F$ is nowhere dense
in $E^n$ (compare Cairns and Morse [1], Theorem 15.3). Thus, there exists a nondegenerate
$L_p$ with $p$ as near to $q'$ as we please (and hence to $q$), having a critical
point where the value of $L_p$ is neither an absolute maximum nor an absolute minimum.
Thus, by the compactness of $F$, $L_p$ has at least three critical points. If $C_i$ denotes
the number of critical points of index $i$ of $L_p$, we have the equality

$$\sum_{i=0}^{n-1} (-1)^i C_i = \chi$$

(see [5], p. 28), where $\chi$ is the Euler characteristic of $F$, which
is even in our special case. This implies $\sum_{i=0}^{n-1} (-1)^i C_i \equiv 0 \bmod 2$, so the total
number $\sum_{i=0}^{n-1} C_i$ of critical points is even, and therefore
there must be a fourth critical point of $L_p$. Hence $p$ must be a four-normals-point
of $F$. In the case $r < M$ replace the inequality given above by $\tfrac12(r - M) < \varepsilon < 0$
and minimum by maximum.
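
As a planar illustration of the theorem (the case n = 2), the following Python sketch counts the normals of an ellipse that pass through a given point by counting the critical points of the squared distance function. The ellipse, its semi-axes, the sample count and the name `count_normals` are assumptions chosen for illustration; they do not appear in the note.

```python
import math

def count_normals(p, a=2.0, b=1.0, samples=20000):
    """Count critical points of L_p(t) = |x(t) - p|^2 on the ellipse
    x(t) = (a cos t, b sin t); each one gives a normal line through p."""
    px, py = p

    def dL(t):
        # Derivative of the squared distance with respect to t (up to a factor 2).
        x, y = a * math.cos(t), b * math.sin(t)
        return (x - px) * (-a * math.sin(t)) + (y - py) * (b * math.cos(t))

    # Sample at half-integer grid points so that the zeros at t = 0 and t = pi
    # (present when p lies on the major axis) fall strictly between samples.
    values = [dL(2.0 * math.pi * (i + 0.5) / samples) for i in range(samples)]
    # Each simple zero of dL produces one sign change around the closed curve.
    return sum(1 for i in range(samples)
               if values[i] * values[(i + 1) % samples] < 0)

# The center of curvature at the vertex (2, 0) is the focal point (1.5, 0).
print(count_normals((1.4, 0.0)))  # a point near the focal point, inside the evolute
print(count_normals((1.6, 0.0)))  # a point near the focal point, outside the evolute
```

For these semi-axes the point (1.4, 0) lies on four normals and the point (1.6, 0) on only two, so four-normals-points occur arbitrarily close to the focal point (1.5, 0) but not on every side of it.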


COROLLARY. For a closed $C^2$ hypersurface $H$ of $E^n$ (which may have self-intersections)
which has an even Euler characteristic there exists at least one four-normals-point.


Proof. If $H$ is an immersed $(n-1)$-sphere, then this corollary is a direct consequence
of the theorem above because $H$ must have a focal point. If $H$ is not an
immersed $(n-1)$-sphere, then every nondegenerate $L_p$ must have more than two
critical points (compare Milnor [5], p. 25) and therefore at least four because $\chi$
is even. This gives the proof of the corollary because $L_p$ is nondegenerate for almost




all $p \in E^n$. The last case is also contained in the normal arc theorem of Cairns and
Morse ([1], p. 281).

We conclude with two remarks. First, we note that the proofs above do not
depend on the codimension being one and therefore the same statements are valid for
any other codimension.

We may also state that the $(n-1)$-sphere is the only hypersurface $H$ of $E^n$ with
even Euler characteristic which has only one four-normals-point. This remark is
seen as follows: By the proof of the corollary, $H$ must be an immersed $(n-1)$-sphere.
Then the theorem above implies that $H$ has only one focal point, i.e., $H$ is the rigid
$(n-1)$-sphere in $E^n$.


References 


1. S. S. Cairns and M. Morse, Critical Point Theory in Global Analysis and Differential Topology,
Academic Press, New York, 1970.

2. G. D. Chakerian and S. K. Stein, On the centroid of a homogeneous wire, Mich. Math. J., 
11 (1964) 189-192. 

3. N. Deo and M. S. Klamkin, Existence of four concurrent normals to a smooth closed curve, 
this MONTHLY, 77 (1970) 1083-1084. 

4. H. H. Guggenheimer, Geometrical applications of integral calculus, p. 84 (contained in K. O. 
May, Lectures on Calculus, Holden-Day, San Francisco, 1967). 

5. J. Milnor, Morse Theory, Princeton University Press, Princeton, 1963. 


ON A PROBLEM OF BESICOVITCH 
B. FISHER, Leicester


In 1917, S. Kakeya [2] raised the following problem: 


A line segment AB lying in a plane is to be moved in the plane so that it returns
to its original position but with its direction reversed. How should this be done 
in order that the area swept out by the segment during motion may be a minimum? 


In 1928 A. S. Besicovitch [1] gave an answer to this problem, showing that the 
area swept out could be made arbitrarily small. He showed that this result was a 
consequence of the following theorem which he had established [3]: 


THEOREM. Let ABC be any triangle. Divide the base AB into $2^n$ equal parts
and join the points of division to the vertex C, dividing the triangle into $2^n$ elementary
triangles. With suitable choice of $n$ it is then possible to translate the elementary
triangles along AB in such a way that the area covered by them in their
new positions is arbitrarily small.


The original proof of this theorem by Besicovitch was rather difficult and I give 
here a more elementary proof. 




Proof of the theorem: Suppose the area of triangle ABC is $S$ and the length of
AB is $c$. Divide AB into $2^n$ equal parts and label the points of division $Y_i$ for $i = 0,
1, 2, \dots, 2^n$, the particular points given by $i = 0$ and $2^n$ being A and B respectively.
Translate each triangle $Y_{2i}Y_{2i+1}C$ for $i = 0, 1, 2, \dots, 2^{n-1} - 1$ a distance $cx/2^{n-1}$
along AB, where $0 < x < 1$, to form the triangle $X_{2i}X_{2i+1}C'$ in its new position.
The triangle $X_{2i}X_{2i+1}C'$ overlaps the triangle $Y_{2i+1}Y_{2i+2}C$, the sides $X_{2i}C'$ and
$Y_{2i+2}C$ intersecting in a point $Z_i$, to form a triangle $X_{2i}Y_{2i+2}Z_i$ together with two
other triangles outside triangle $X_{2i}Y_{2i+2}Z_i$.

It follows by elementary geometry that the area covered by the two overlapping
triangles $X_{2i}X_{2i+1}C'$ and $Y_{2i+1}Y_{2i+2}C$ is now

$$\frac{S}{2^{n-1}}\left[(1-x)^2 + 2x^2\right],$$

the area of triangle $X_{2i}Y_{2i+2}Z_i$ being

$$\frac{S}{2^{n-1}}(1-x)^2 < \frac{S}{2^{n-1}}(1-x)$$

and the area outside this triangle being $Sx^2/2^{n-2}$. Since $Y_{2i}Z_{i-1}$ is parallel to $X_{2i}Z_i$
for $i = 1, 2, \dots, 2^{n-1} - 1$ it follows that the $2^{n-1}$ triangles $X_{2i}Y_{2i+2}Z_i$ can be translated
along AB to form a triangle $A_1B_1C_1$ with each point $Z_i$ moving to the vertex
$C_1$. The area $S_1$ of triangle $A_1B_1C_1$ is less than $S(1-x)$ and the area covered by
the original $2^n$ elementary triangles outside triangle $A_1B_1C_1$ is less than $2Sx^2$.

The above process is now repeated on the triangle $A_1B_1C_1$, which is made up of
$2^{n-1}$ elementary triangles, to form a triangle $A_2B_2C_2$ with area $S_2$. We have

$$S_2 < S_1(1-x) < S(1-x)^2$$

and the area covered by the original $2^n$ elementary triangles outside $A_2B_2C_2$ is less
than

$$2Sx^2 + 2S_1x^2 < 2Sx^2 + 2Sx^2(1-x).$$

Further repetitions of this process give us a triangle $A_rB_rC_r$ with area $S_r$. We
have

$$S_r < S_{r-1}(1-x) < S(1-x)^r$$

and the area covered by the original $2^n$ elementary triangles outside triangle $A_rB_rC_r$
is less than

$$2Sx^2\left[1 + (1-x) + \cdots + (1-x)^{r-1}\right] < 2Sx.$$

Now, for arbitrary $\varepsilon > 0$, put $x = \varepsilon/2S$. The area covered by the original $2^n$
elementary triangles outside triangle $A_rB_rC_r$ is then less than $\varepsilon$. Further the area
of triangle $A_rB_rC_r$ is less than $S(1 - \varepsilon/2S)^r$, which is less than $\varepsilon$ for $r$ greater than




some $N$. Thus if $n > r > N$ the area covered by the $2^n$ elementary triangles is now
less than $2\varepsilon$, completing the proof of the theorem.
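
The choice of parameters in the last step can be made concrete. The Python sketch below computes $x = \varepsilon/2S$ and the least number $r$ of repetitions for which the bound $S(1-x)^r$ drops below $\varepsilon$, together with the bound on the area left outside the final triangle; the function names are hypothetical and serve only to illustrate the estimates of the proof.

```python
def besicovitch_parameters(S, eps):
    """Return (x, r): the ratio x = eps / (2S) and the least r with
    S * (1 - x)**r < eps, following the estimates in the proof."""
    if not 0 < eps < 2 * S:
        raise ValueError("need 0 < eps < 2S so that 0 < x < 1")
    x = eps / (2 * S)
    r, inner = 0, S
    while inner >= eps:       # inner is the bound S(1 - x)^r on the area S_r
        inner *= 1 - x
        r += 1
    return x, r

def outside_bound(S, x, r):
    """Bound 2 S x^2 [1 + (1 - x) + ... + (1 - x)^(r-1)] from the proof."""
    return 2 * S * x * x * sum((1 - x) ** j for j in range(r))

x, r = besicovitch_parameters(S=1.0, eps=0.1)
print(x, r, outside_bound(1.0, x, r))
```

For $S = 1$ and $\varepsilon = 0.1$ this gives $x = 0.05$ and $r = 45$, so in this instance any subdivision into $2^n$ elementary triangles with $n > 45$ suffices for a covered area below $2\varepsilon$.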

To see how Kakeya’s problem now follows from this theorem the reader is recommended
to read Besicovitch’s original paper.


References 


1. A. S. Besicovitch, On Kakeya’s problem and a similar one, Math. Zeit., 27 (1928) 312-320.

2. S. Kakeya, Tôhoku Science Reports, 6 (1917) 71-78.

3. A. S. Besicovitch, Sur deux questions de l’intégrabilité, Journal de la Société des Math. et de
l’Univ. à Perm, 2 (1920) 105-123.


TOPOLOGICAL PROPERTIES OF THE ROW ECHELON FORM 
G. P. BARKER, University of Missouri at Kansas City 


Motivated by [1] we investigate the topological nature of the set of matrices in 
row echelon form. We can define an equivalence relation on this set which involves 
only the elements of the matrices. This equivalence relation seems rather natural, 
and it is pleasing to find that it has topological significance. 

Throughout this note $F$ denotes either the real or the complex numbers, and $M$
denotes the set of $m \times n$ matrices with elements from $F$. The usual topology on $M$
can be generated by the norm $\mu(A) = \max |a_{ij}|$ for $A \in M$. Next we recall two
definitions [2, p. 44].

Let $A \in M$. The leading entry of a row of $A$ is the first nonzero entry of that row.
Denote by $l(i; A)$ the index of the column of $A$ which contains the leading entry of
row $i$ if that row is nonzero. Otherwise, set $l(i; A) = n + 1$. The matrix $A$ is in
row echelon form if

(1) the nonzero rows are at the top of the matrix;

(2) $l(1; A) < l(2; A) < \cdots < l(r; A)$, where $r = \operatorname{rank} A$;

(3) all leading entries are 1;

(4) $j = l(i; A)$ implies $a_{ij} = 1$ and $a_{kj} = 0$ for $k \ne i$.

Let $R$ denote the set of $m \times n$ matrices in row echelon form with elements from $F$.
The topology on $R$ is the subspace topology induced by $M$. If $l(i; A)$ is the column
index for the leading entry of row $i$ of $A$ and $l(i; B)$ is the same for $B$, then $A$ and $B$
are pattern equivalent if $l(i; A) = l(i; B)$ for $i = 1, 2, \dots, m$. This relation is denoted
by $A \sim B$, and it is easily seen to be an equivalence relation.
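
The definitions above translate directly into code. The following Python sketch (function names are hypothetical; matrices are plain lists of rows, and column indices are 1-based as in the text) computes $l(i; A)$, tests conditions (1)-(4), and tests pattern equivalence.

```python
def leading_index(row):
    """1-based column of the first nonzero entry, or n + 1 for a zero row."""
    for j, value in enumerate(row, start=1):
        if value != 0:
            return j
    return len(row) + 1

def pattern(A):
    """The tuple (l(1; A), ..., l(m; A))."""
    return tuple(leading_index(row) for row in A)

def is_row_echelon(A):
    m, n = len(A), len(A[0])
    idx = pattern(A)
    r = sum(1 for l in idx if l <= n)                    # number of nonzero rows
    if any(l > n for l in idx[:r]):                      # (1) nonzero rows on top
        return False
    if any(idx[i] >= idx[i + 1] for i in range(r - 1)):  # (2) strictly increasing
        return False
    for i in range(r):                                   # (3) and (4)
        j = idx[i] - 1
        if A[i][j] != 1 or any(A[k][j] != 0 for k in range(m) if k != i):
            return False
    return True

def pattern_equivalent(A, B):
    return pattern(A) == pattern(B)

A = [[1, 2, 0], [0, 0, 1]]
B = [[1, -3, 0], [0, 0, 1]]
C = [[1, 0, 5], [0, 1, 2]]
print(pattern(A), pattern_equivalent(A, B), pattern_equivalent(A, C))
```

Here `A` and `B` share the pattern (1, 3), while `C` has pattern (1, 2); any convex combination of `A` and `B` is again in row echelon form with the same pattern.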

The norm $\mu$ restricted to $R$ determines an open base of the neighborhoods of
any point $A \in R$. If $A$ and $B$ are elements of $R$ such that $\mu(A - B) < \varepsilon$ for $\varepsilon < \tfrac12$, say,
then $A$ and $B$ are pattern equivalent. In fact, if

$$S(A) = \{B \in R \mid \mu(A - B) < \varepsilon\},$$

where $\varepsilon$ is small, say $\varepsilon < \tfrac12$, and if $0 \le \alpha \le 1$, then for any $C_1$ and $C_2$ in $S(A)$ we have



$$\alpha C_1 + (1 - \alpha) C_2 \in S(A).$$

We can collect these observations as a formal result. 


PROPOSITION. R is locally convex. 


Consequently, R is locally connected so that the components of R are both open 
and closed (see [3], p. 72). 


In analogy with [1] we shall write $A \sim_a B$ if $A$ and $B$ lie in the same arcwise
connected component of $R$, and $A \sim_c B$ if $A$ and $B$ lie in the same component of $R$.
Both $\sim_a$ and $\sim_c$ are equivalence relations. The equivalence class of $A$ with respect to
pattern equivalence, $\sim_a$, and $\sim_c$ will be denoted by $P(A)$, $A(A)$, and $C(A)$ respectively.


THEOREM. The three relations (pattern equivalence, $\sim_a$, and $\sim_c$) are equivalent; that is, for each
$A \in R$ we have $P(A) = C(A) = A(A)$.


Proof. We first show that $P(A) = A(A)$.


If $B \in P(A)$, then $f(\alpha) = (1 - \alpha)A + \alpha B$ is an arc from $A$ to $B$ which lies entirely
within $P(A)$. Hence $P(A) \subseteq A(A)$. Conversely suppose

$$f(\tau) = \begin{pmatrix} f_{11}(\tau) & \cdots & f_{1n}(\tau) \\ \vdots & & \vdots \\ f_{m1}(\tau) & \cdots & f_{mn}(\tau) \end{pmatrix}$$

is a continuous function from $[0,1]$ to $A(A)$ with $f(0) = A$ and $f(1) = B$. If
$B \notin P(A)$, then for some $i$ necessarily $l(i; B) \ne l(i; A)$. We may assume $l(i; B) < l(i; A)$
so that $f_{ij}(0) = 0$ for $j = 1, 2, \dots, l(i; A) - 1$. Each $f_{ij}(\tau)$ is a continuous function and
for $l = l(i; B)$ we have $f_{il}(0) = 0$ and $f_{il}(1) = 1$. Consequently we can find a $\tau_0$ with
$0 \le \tau_0 < 1$ and a $k$ satisfying $l(i; B) \le k < l(i; A)$ such that for all sufficiently small
$\varepsilon > 0$ we have $f_{ik}(\tau_0 + \varepsilon) \ne 0$ while $f_{ij}(\tau) = 0$ for all $\tau \le \tau_0$ and all $j = 1, 2, \dots, l(i; A) - 1$.
Note that $\tau_0 = 1$ is not possible since that would mean that the continuous
function $f_{il}(\tau)$ would map the interval $[0, \tau_0]$ onto the set $\{0, 1\}$. But now we see that
for $\varepsilon > 0$ sufficiently small $f_{ik}(\tau_0 + \varepsilon) \ne 1$ so that $f_{ip}(\tau_0 + \varepsilon) = 1$ for some $p < k$ as
$\varepsilon \downarrow 0$. This contradicts the fact that $f_{ip}(\tau_0) = 0$. Hence $B \in P(A)$.

We finish the proof by showing that the arc components are both open and
closed. If $A_n \in A(A)$ and $B = \lim A_n$, then each $A_n$ is pattern equivalent to $A$. However,
the convergence is coordinatewise so that $B$ must be pattern equivalent to $A$. Thus
$B \in A(A)$, and $A(A)$ is closed. On the other hand if $B \in A(A)$, and as before, if

$$S(B) = \{C \in R \mid \mu(B - C) < \varepsilon\},$$

where $\varepsilon < \tfrac12$, say, then $S(B)$ is a neighborhood of $B$ which is open in $R$. It is also
clear that $S(B) \subseteq P(A) = A(A)$. Hence $A(A)$ is also open, and so $A(A) = C(A)$.


Acknowledgement. The author would like to thank the referee for several helpful suggestions. 




References 


1. H. Schneider, Topological aspects of Sylvester’s theorem on the inertia of Hermitian matrices, 
this MONTHLY, 73 (1966) 817-821. 
2. H. Schneider and G. P. Barker, Matrices and Linear Algebra, Holt, Rinehart and Winston,
New York, 1968.
3. A. Wilansky, Topology for Analysis, Ginn, Waltham, Mass., 1970. 


TWO-DIMENSIONAL COMPLETE MONOTONICITY WITH DIAGONALIZATION 
C. H. KIMBERLING, University of Evansville 


1. Introduction. A continuous function $f$ from $[0,\infty)$ into $[0,\infty)$ is completely
monotone if its derivatives alternate in sign: $(-1)^n f^{(n)}(x) \ge 0$ for $n = 0, 1, 2, \dots$ and
all $x$ in $(0, \infty)$. Correspondingly, a sequence $\mu_0, \mu_1, \mu_2, \dots$ of nonnegative real numbers
is a completely monotone sequence if for each $n$, all the differences

$$\begin{aligned}
\Delta^1\mu_n &= \mu_n - \mu_{n+1} \\
\Delta^2\mu_n &= \mu_n - 2\mu_{n+1} + \mu_{n+2} \\
&\;\;\vdots \\
\Delta^k\mu_n &= \mu_n - \binom{k}{1}\mu_{n+1} + \binom{k}{2}\mu_{n+2} - \cdots + (-1)^k\mu_{n+k}
\end{aligned}$$

are non-negative. Our symbols $\Delta^k\mu_n$ just defined follow [2], but not [3].
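
The difference operator just defined is easy to evaluate exactly. The Python sketch below (names are hypothetical) implements $\Delta^k\mu_n$ with rational arithmetic and runs a finite check of nonnegativity; a finite check can refute complete monotonicity but cannot prove it.

```python
from fractions import Fraction
from math import comb

def difference(mu, k, n):
    """k-th difference: sum over j of (-1)^j C(k, j) mu(n + j)."""
    return sum((-1) ** j * comb(k, j) * mu(n + j) for j in range(k + 1))

def looks_completely_monotone(mu, max_k=8, max_n=8):
    """Check difference(mu, k, n) >= 0 for all k <= max_k, n <= max_n (finite test only)."""
    return all(difference(mu, k, n) >= 0
               for k in range(max_k + 1) for n in range(max_n + 1))

def harmonic(n):
    return Fraction(1, n + 1)   # the moments of dt on [0, 1]

def squares(n):
    return Fraction(n * n)      # increasing, hence not completely monotone

print(difference(harmonic, 2, 0))
print(looks_completely_monotone(harmonic), looks_completely_monotone(squares))
```

For the sequence $1/(n+1)$ the second difference at $n = 0$ is $1 - 1 + 1/3 = 1/3$, and every difference tested is nonnegative; the increasing sequence $n^2$ already fails at $k = 1$.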

We shall consider two types of infinite matrices and associated two-place functions. 
In the first type of matrix, each row and each column forms a completely monotone 
sequence. The sequence of diagonal elements of such a matrix need not be completely 
monotone, but additional monotonicity conditions ensure a completely monotone 
diagonal. Analogous conditions on a two-place function f(x, y) imply complete 
monotonicity of the one-place function f(x, x). 

The second type of matrix arises from the derivatives of a given completely
monotone function $f$ on $[0, \infty)$ as follows: the $n$th row is $(-1)^n f^{(n)}(k)$, $k = 0, 1, 2, \dots$.
The diagonal $(-1)^n f^{(n)}(n)$ is then a completely monotone sequence which extends to
a completely monotone function.


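2. Two-dimensional complete monotonicity. We call $(\mu_{ij})$ a completely monotone
matrix if

$$(1)\qquad \Delta_1^n \Delta_2^m \mu_{ij} \ge 0; \qquad n, m = 0, 1, 2, \dots; \quad i, j = 0, 1, 2, \dots,$$

where $\Delta_1$ and $\Delta_2$ denote the difference operator applied to the first and second index respectively.
For example, if $(\mu_i)$ and $(\nu_j)$ are completely monotone sequences, then $(\mu_i\nu_j)$ is a
completely monotone matrix. Also $(\mu_{i+j})$ is a completely monotone matrix.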

We call $(\mu_{ij})$ a placewise completely monotone matrix if each of its rows and
columns is a completely monotone sequence. Clearly a completely monotone matrix




is a placewise completely monotone matrix. The converse fails, for example, when
$\mu_{ij} = (i + 1)^{-j-1}$. Other examples arise from any given positive constants $a$ and $b$
and completely monotone $f$ on $[0, \infty)$ by setting


Mi; =fLi+ Ga —j + Vib). 
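
That the converse fails can be checked by hand or by machine. The Python sketch below assumes the counterexample reads $\mu_{ij} = (i+1)^{-j-1}$ and evaluates the mixed difference $\Delta_1\Delta_2\mu_{00}$ exactly; the helper names are hypothetical.

```python
from fractions import Fraction

def mu(i, j):
    return Fraction(1, (i + 1) ** (j + 1))      # assumed entry (i + 1)^(-j-1)

def delta1(f):
    return lambda i, j: f(i, j) - f(i + 1, j)   # difference in the first index

def delta2(f):
    return lambda i, j: f(i, j) - f(i, j + 1)   # difference in the second index

mixed = delta1(delta2(mu))(0, 0)                # mu_00 - mu_10 - mu_01 + mu_11
print(mixed)
```

The mixed difference equals $1 - \tfrac12 - 1 + \tfrac14 = -\tfrac14$, so condition (1) fails for $n = m = 1$ even though each row and each column of this matrix is a completely monotone sequence.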


THEOREM 1. If $(\mu_{ij})$ is a completely monotone matrix, then $(\mu_{ii})$ is a completely
monotone sequence.


Proof:

$$\Delta^k \mu_{ii} = \sum_{j=0}^{k} \binom{k}{j} \Delta_1^{k-j} \Delta_2^{j}\, \mu_{i+j,\,i} \ge 0; \qquad k = 0, 1, 2, \dots.$$

Turning now toward the analogous theorem for two-place functions, we begin
with a theorem [2, p. 9] which generalizes the classical Bernstein representation of
completely monotone sequences by Riemann-Stieltjes integration with respect to a
bounded nondecreasing function. The generalized theorem states that a necessary
and sufficient condition for (1) to hold is that there exist a two-place distribution
function $\Phi$ satisfying

$$(2)\qquad \mu_{ij} = \int_0^1 \!\! \int_0^1 u^i v^j \, d\Phi; \qquad i, j = 0, 1, 2, \dots.$$


By a two-place completely monotone function we mean a function $f$ on $[0, \infty)^2$,
all of whose partial derivatives of all orders exist and satisfy

$$(3)\qquad (-1)^{n+m} D_1^n D_2^m (f) \ge 0; \qquad n, m = 0, 1, 2, \dots.$$

Analogous to the representation (2), we have (3) if and only if there exists a two-place
distribution function $\Phi$ satisfying

$$(4)\qquad f(x, y) = \int_0^1 \!\! \int_0^1 u^x v^y \, d\Phi; \qquad (x, y) \in [0, \infty)^2.$$



THEOREM 2. If $f(x, y)$ is a two-place completely monotone function on $[0, \infty)^2$,
then the function $h(x) = f(x, x)$ is completely monotone on $[0, \infty)$.


Proof: From (4), we have

$$D_1^j D_2^{k-j} f(x, y) = \int_0^1 \!\! \int_0^1 u^x (\log u)^j \, v^y (\log v)^{k-j} \, d\Phi; \qquad j = 0, 1, \dots, k.$$

The integral is clearly nonnegative for even $k$ and nonpositive for odd $k$. Thus, from
the identity

$$h^{(k)}(x) = \sum_{j=0}^{k} \binom{k}{j} \left. D_1^j D_2^{k-j} f(x, y) \right|_{y = x}$$

we conclude that $(-1)^k h^{(k)} \ge 0$; $k = 0, 1, 2, \dots$.




3. Derivative matrices. Suppose $f$ is completely monotone on $[0, \infty)$. The rows
of the matrix $(\mu_{ij}) = ((-1)^i f^{(i)}(j))$ are then completely monotone sequences [3, p.
164], but the columns generally are not completely monotone sequences.


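THEOREM 3. If $f$ is completely monotone on $[0, \infty)$, then the sequence $((-1)^n f^{(n)}(n))$
is completely monotone.


Proof. Writing $f(x) = \int_0^1 t^x \, d\alpha(t)$ (by Bernstein’s theorem) and $\mu_n = (-1)^n f^{(n)}(n)$,
we have for $k = 0, 1, 2, \dots$,

$$\begin{aligned}
\Delta^k \mu_n &= (-1)^n \sum_{j=0}^{k} \binom{k}{j} f^{(n+j)}(n+j) \\
&= (-1)^n \int_0^1 (t \log t)^n (1 + t \log t)^k \, d\alpha(t) \ge 0.
\end{aligned}$$
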


Example. Starting with $\alpha(t) = t$, we have in Theorem 3 the function $f(x)
= 1/(x + 1)$. By Theorem 3, the sequence $(n + 1)^{-n-1}\, n!$ is completely monotone.
One may conjecture and easily verify that the corresponding function $x^{-x}\,\Gamma(x)$ is
also completely monotone.
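
The example can be spot-checked numerically. The Python sketch below takes the sequence to be $\mu_n = n!/(n+1)^{n+1}$, which is $(-1)^n f^{(n)}(n)$ for $f(x) = 1/(x+1)$, and verifies with exact rational arithmetic that its differences are nonnegative for small $k$ and $n$; the ranges tested are arbitrary choices, not part of the example.

```python
from fractions import Fraction
from math import comb, factorial

def mu(n):
    # (-1)^n f^(n)(n) for f(x) = 1/(x + 1), since f^(n)(x) = (-1)^n n! / (x + 1)^(n+1)
    return Fraction(factorial(n), (n + 1) ** (n + 1))

def difference(k, n):
    """k-th difference of the sequence mu at index n."""
    return sum((-1) ** j * comb(k, j) * mu(n + j) for j in range(k + 1))

all_nonnegative = all(difference(k, n) >= 0 for k in range(11) for n in range(11))
print(difference(1, 0), all_nonnegative)
```

The first difference at $n = 0$ is $1 - \tfrac14 = \tfrac34$, and all differences in the tested range are nonnegative, as Theorem 3 predicts.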


References 


1. R. P. Boas, Signs of Derivatives and Analytic Behavior, this MONTHLY, 78 (1971) 1085-1093. 
2. J. A. Shohat and J. D. Tamarkin, The Problem of Moments, American Mathematical Society,
Providence, 1950.
3. D. V. Widder, The Laplace Transform, Princeton University Press, Princeton, 1946. 


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY 
In this Department the Monthly presents easily stated research problems dealing with notions
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by
relevant references (if any are known to the author) and by a brief description of known partial
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada.


THE PERMANENT OF A DOUBLY STOCHASTIC MATRIX 
RUSSELL MERRIS, California State University, Hayward 


I. Statement of the problem. If $A = (a_{ij})$ is an $n$-square matrix, the permanent
of $A$ is the scalar-valued function of $A$ defined by

$$\operatorname{per}(A) = \sum a_{1 i_1} a_{2 i_2} \cdots a_{n i_n},$$





where the summation extends over all permutations $i_1, i_2, \dots, i_n$ of the integers
$1, 2, \dots, n$. Loosely speaking, the permanent is the determinant without the alternating
minus signs. A good introduction to the permanent can be found in [8] or [12].

A matrix with nonnegative entries, and whose row and column sums are all equal
to one, is called doubly stochastic. B. L. van der Waerden has conjectured [13] that
$\operatorname{per} A \ge n!/n^n$ for all doubly stochastic $n$-square matrices $A$, with equality holding
if and only if $A = J_n$, the matrix each of whose entries is $1/n$.

Results of Marcus and Newman [10] have led to the following conjecture [4]
which is stronger than van der Waerden’s:

$$(1)\qquad n^2 \operatorname{per}(A) \ge \sum_{i,j=1}^{n} \operatorname{per}(A_{ij})$$

for all doubly stochastic $A$, where $A_{ij}$ is the submatrix of $A$ obtained by deleting
row $i$ and column $j$. (Inequality (1) has been demonstrated for $A$ positive semidefinite
symmetric in [6] and [9].)

Consider now the permanental adjoint of the matrix $A$. It is the $n$-square matrix
whose $i,j$ entry is $\operatorname{per}(A_{ji})$. Then (1) asserts that, on the average, $\operatorname{per}(A)$ dominates
the elements of the permanental adjoint of $A$.

If one could prove that $n \operatorname{per}(A)$ dominated the maximum row sum of the permanental
adjoint of $A$, then of course (1) would follow. Unfortunately, this is not
the case. Let

$$A = \frac{1}{12}\begin{pmatrix} 0 & 7 & 5 \\ 7 & 0 & 5 \\ 5 & 5 & 2 \end{pmatrix}.$$

The permanental adjoint of $A$ is

$$P = \frac{1}{144}\begin{pmatrix} 25 & 39 & 35 \\ 39 & 25 & 35 \\ 35 & 35 & 49 \end{pmatrix}.$$

The third row sum of $P$ is 119/144. Thrice the permanent of $A$ is 112/144.
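
The arithmetic of this counterexample is easy to reproduce. The Python sketch below computes permanents by brute force over permutations (adequate for a 3-square matrix) with exact fractions; the helper names are hypothetical, and the $(i,j)$ entry of the adjoint is taken as $\operatorname{per}(A_{ji})$, an indexing that makes no difference here because the example is symmetric.

```python
from fractions import Fraction
from itertools import permutations

def permanent(M):
    """Sum over all permutations of the products M[0][s0] * ... * M[n-1][s(n-1)]."""
    n = len(M)
    total = Fraction(0)
    for sigma in permutations(range(n)):
        term = Fraction(1)
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total

def minor(M, i, j):
    """Submatrix obtained by deleting row i and column j (0-based)."""
    n = len(M)
    return [[M[r][c] for c in range(n) if c != j] for r in range(n) if r != i]

def permanental_adjoint(M):
    n = len(M)
    return [[permanent(minor(M, j, i)) for j in range(n)] for i in range(n)]

A = [[Fraction(v, 12) for v in row] for row in ([0, 7, 5], [7, 0, 5], [5, 5, 2])]
P = permanental_adjoint(A)
print([sum(row) for row in P], 3 * permanent(A))
```

The row sums of $P$ come out as 99/144, 99/144 and 119/144, while $3\operatorname{per}(A) = 112/144$, so the maximum row sum exceeds $n\operatorname{per}(A)$ but the minimum row sum does not.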
Well, what about the minimum row sum? A prima facie weaker statement than
(1) is that $n \operatorname{per}(A)$ dominates the minimum row sum of the permanental adjoint
of $A$, i.e.,

$$(2)\qquad n \operatorname{per}(A) \ge \min_{i} \sum_{j=1}^{n} \operatorname{per}(A_{ji}).$$

This is the problem which I wish to propose: Can (2) be demonstrated for all doubly
stochastic $n$-square matrices $A$?




II. Equivalent statement of the problem. Let $M = (m_{ij})$ be any $n$-square matrix.
Denote by $r_i(M)$ the $i$th row sum of $M$, and by $r(M)$ the sum of the elements of $M$.
If $A$ is any doubly stochastic matrix, then $r_i(M) = r_i(MA)$, for

$$r_i(MA) = \sum_{j=1}^{n} \sum_{k=1}^{n} m_{ik} a_{kj} = \sum_{k=1}^{n} m_{ik} \sum_{j=1}^{n} a_{kj} = \sum_{k=1}^{n} m_{ik} = r_i(M).$$



If $P$ is the permanental adjoint of $A$, what does $PA$ look like? By the Laplace
expansion theorem (for permanents [7, p. 20]), the $i,i$ element of $PA$ is $\operatorname{per}(A)$, for
all $i$. But, whereas the product of $A$ with its determinantal adjoint is $\det(A)I$, the
off diagonal terms of $PA$ are not zero. However, by our little argument above,
conjecture (1) is equivalent to the statement that, on the average, $\operatorname{per}(A)$ dominates
the off diagonal elements of $PA$. More precisely, $n^2 \operatorname{per}(A) \ge r(PA)$. Similarly, an
equivalent formulation of (2) is

$$(3)\qquad n \operatorname{per}(A) \ge \min_{i} r_i(PA).$$



III. A method of attack. It is known that if $N$ is a square matrix with nonnegative
entries, then the maximum eigenvalue of $N$ dominates the minimum row
sum of $N$ [5, pp. 63 and 68]. Therefore, (2) would be established if one could prove
that $n \operatorname{per}(A)$ dominated the maximum eigenvalue of $P$. Alternatively, (3) would be
proved if one could show that $n \operatorname{per}(A)$ ($= \operatorname{trace} PA$) dominated the maximum
eigenvalue of $PA$.


References


1. R. A. Brualdi, Permanent of the product of doubly stochastic matrices, Proc. Cambridge 
Philos. Soc., 62 (1966) 643-648, Lemma 1. 

2. R. A. Brualdi and M. Newman, Inequalities for permanents and permanental minors, Proc. 
Cambridge Philos. Soc., 61 (1965) 741-746. 

3. R. A. Brualdi and M. Newman, Inequalities for the permanental minors of nonnegative matrices, Canad. J. Math.,
18 (1966) 608-615.

4. D. Z. Dokovic, On a conjecture by van der Waerden, Mat. Vesnik, 4 (19) (1967) 272-276.

5. F. R. Gantmacher, The theory of matrices, vol. 2, Chelsea, New York, 1960. 

6. Marvin Marcus and Russell Merris, A relation between the permanental and determinantal 
adjoints, J. Australian Math. Soc. 

7. Marvin Marcus and Henryk Minc, A survey of matrix theory and matrix inequalities, Allyn 
and Bacon, Boston, 1964. 

8. Marvin Marcus and Henryk Minc, Permanents, this MONTHLY, 72 (1965) 577-591.

9. Marvin Marcus and Henryk Minc, Extensions of classical matrix inequalities, Linear Algebra and Appl., 1 (1968) 421-444.

10. Marvin Marcus and Morris Newman, On the minimum of the permanent of a doubly stocha- 
stic matrix, Duke Math. J., 26 (1959) 61-72. 

11. Henryk Minc, On lower bounds for permanents of (0,1) matrices, Proc. Amer. Math. Soc., 
22 (1969) 117-123, Lemma 1. 

12. Herbert John Ryser, Combinatorial mathematics, Carus Monograph, No. 14, MAA, 1963.
13. B. L. van der Waerden, Aufgabe 56, Jber. Deutsch. Math.-Verein, 35 (1926) 117. 


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


A GENERALIZATION OF A THEOREM OF ARCHIMEDES 
WALTER RUDIN, University of Wisconsin


THEOREM A. If two parallel planes whose distance is $d$ intersect a sphere $S$
of radius $r$, then the area of the part of $S$ that lies between the two planes is $2\pi r d$.


This is the theorem to which the title of this note alludes. (See p. 145 of C. B. 
Boyer’s A History of Mathematics, Wiley, 1968.) 

If the euclidean space $R^3$ is equipped with its usual coordinate system, so that
the unit sphere $S^2$ is the set of all $x = (x_1, x_2, x_3)$ with $x_1^2 + x_2^2 + x_3^2 = 1$, then
Theorem A is seen to be equivalent to


THEOREM A′. If $0 \le \delta \le 1$ and if $E(\delta)$ is the set of all $x \in S^2$ with $|x_3| \le \delta$,
then the area of $E(\delta)$ is $4\pi\delta$.


This formulation of Archimedes’ theorem leads to a problem which may interest 
a good Calculus class when multiple integrals are studied. 

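Fix integers $n$ and $k$, $1 \le k \le n-1$. Write each $x = (x_1, \dots, x_n) \in R^n$ in the
form

$$(1)\qquad x = (x', x''),$$

where $x' = (x_1, \dots, x_k) \in R^k$, $x'' = (x_{k+1}, \dots, x_n) \in R^{n-k}$. Define

$$(2)\qquad \|x'\| = (x_1^2 + \cdots + x_k^2)^{1/2}, \qquad \|x''\| = (x_{k+1}^2 + \cdots + x_n^2)^{1/2}.$$

The unit sphere $S^{n-1}$ in $R^n$ is then the set of all $x \in R^n$ with $\|x'\|^2 + \|x''\|^2 = 1$.
For $0 \le \delta \le 1$, let $E(\delta)$ be the set of all $x \in S^{n-1}$ for which $\|x''\| \le \delta$.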


QUESTION. For which pairs $(n,k)$ is the $(n-1)$-dimensional volume of $E(\delta)$
proportional to $\delta^{n-k}$?


Theorem A′ asserts that (3,2) is such a pair. The answer to the question has a
feature which may be surprising: It depends only on $k$.


THEOREM 1. The $(n-1)$-dimensional volume of $E(\delta)$ is proportional to $\delta^{n-k}$ if
and only if $k = 2$.


The computation which proves Theorem 1 is made easy by Theorem 2 which 
will now be formulated. It reduces the computation of certain n-dimensional volumes 
to the evaluation of integrals over plane regions. 




Suppose $1 \le k \le n-1$, as above. Let $Q$ be the closed first quadrant in $R^2$;
explicitly, $Q$ is the set of all $(\xi, \eta)$ with $\xi \ge 0$, $\eta \ge 0$. Let $\phi$ be the mapping of $R^n$
onto $Q$ defined by

$$(3)\qquad \phi(x) = \phi(x', x'') = (\|x'\|, \|x''\|).$$

To see what $\phi$ does, observe that if $q = (\xi, \eta) \in Q$ then $\phi^{-1}(q)$ is the cartesian
product of a $(k-1)$-dimensional sphere and an $(n-k-1)$-dimensional one, unless
$\xi = 0$ or $\eta = 0$; when $\xi = 0 < \eta$, $\phi^{-1}(q)$ is a sphere of dimension $n-k-1$; when
$\xi > 0 = \eta$, then $\phi^{-1}(q)$ is a sphere of dimension $k-1$; when $\xi = 0 = \eta$, $\phi^{-1}(q)$
is a point.

Let $V_n$ be the $n$-dimensional volume of the unit ball in $R^n$. For example, $V_1 = 2$,
$V_2 = \pi$. Let $\sigma_{n-1}$ be the $(n-1)$-dimensional volume of $S^{n-1}$. Thus $\sigma_0 = 2$, $\sigma_1 = 2\pi$,
$\sigma_2 = 4\pi, \dots$. (The other values of $V_n$ and $\sigma_{n-1}$ are in (14) and (15).)

In general, let $m_n(A)$ denote the $n$-dimensional volume (or measure) of the set $A$.


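THEOREM 2. If $\Omega$ is a region in $Q$ and if $A = \phi^{-1}(\Omega)$ is the set of all
$x = (x', x'') \in R^n$ with $(\|x'\|, \|x''\|) \in \Omega$, then

$$(4)\qquad m_n(A) = k(n-k) V_k V_{n-k} \iint_{\Omega} \xi^{k-1} \eta^{n-k-1} \, d\xi \, d\eta.$$


Proof of Theorem 2. Let $\Omega$ first be a rectangle, given by $0 \le a < \xi < b$,
$0 \le \alpha < \eta < \beta$. Then $A = A' \times A''$, where $A'$ is the set of all $x' \in R^k$ with
$a < \|x'\| < b$, and $A''$ is the set of all $x'' \in R^{n-k}$ with $\alpha < \|x''\| < \beta$. Hence

$$(5)\qquad m_n(A) = m_k(A')\, m_{n-k}(A'') = (b^k - a^k) V_k \cdot (\beta^{n-k} - \alpha^{n-k}) V_{n-k}.$$

On the other hand,

$$(6)\qquad \iint_{\Omega} \xi^{k-1} \eta^{n-k-1} \, d\xi \, d\eta = \frac{b^k - a^k}{k} \cdot \frac{\beta^{n-k} - \alpha^{n-k}}{n-k}.$$
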

Comparison of (5) and (6) shows that (4) holds for these rectangles. Hence (4) holds
for general $\Omega$, by any of the standard approximation procedures. (In fact, the collection
of all sets $\Omega$ for which (4) holds is easily seen to be a $\sigma$-algebra, so that (4)
holds for every Borel set $\Omega$; it also holds for every Lebesgue-measurable $\Omega$.)


Proof of Theorem 1. Let $C(\delta)$ be the cone with base $E(\delta)$ and vertex at the origin.
In other words, $C(\delta)$ is the union of all intervals in $R^n$ which have one endpoint
at the origin and the other in $E(\delta)$. Or, $C(\delta) = \{tx : 0 \le t \le 1,\ x \in E(\delta)\}$. For $r > 0$,
note that

$$(7)\qquad m_n(rC(\delta)) = r^n m_n(C(\delta)),$$

and that $m_{n-1}(E(\delta))$ is the derivative of the left side of (7), evaluated at $r = 1$. Since
the derivative of $r^n$ is $n$ when $r = 1$, (7) implies that




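$$(8)\qquad m_{n-1}(E(\delta)) = n\, m_n(C(\delta)).$$

The special case $\delta = 1$ yields

$$(9)\qquad \sigma_{n-1} = nV_n.$$

Note also that $C(\delta) = \phi^{-1}(\Omega)$, where $\Omega$ consists of all $(\xi, \eta) \in Q$ that satisfy

$$(10)\qquad \xi^2 + \eta^2 \le 1 \quad\text{and}\quad \eta \le (\tan\alpha)\cdot\xi;$$

here $\alpha$ is chosen so that $\sin\alpha = \delta$, $0 \le \alpha \le \pi/2$.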
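By switching to polar coordinates and then setting $t = \sin\theta$, it now follows
from (8), (9), and Theorem 2 that

$$(11)\qquad \begin{aligned}
m_{n-1}(E(\delta)) &= n\,\sigma_{k-1}\sigma_{n-k-1} \iint_{\Omega} \xi^{k-1}\eta^{n-k-1} \, d\xi \, d\eta \\
&= n\,\sigma_{k-1}\sigma_{n-k-1} \int_0^1 \rho^{n-1} \, d\rho \int_0^{\alpha} (\cos\theta)^{k-1} (\sin\theta)^{n-k-1} \, d\theta \\
&= \sigma_{k-1}\sigma_{n-k-1} \int_0^{\delta} (1 - t^2)^{(k-2)/2}\, t^{n-k-1} \, dt.
\end{aligned}$$

It is now clear that $m_{n-1}(E(\delta))$ is proportional to $\delta^{n-k}$ if and only if the last integrand
is proportional to $t^{n-k-1}$, and this happens if and only if $k = 2$.
Theorem 1 is thus proved.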
Theorem 1 is thus proved. 


REMARKS. When $k = 2$, (11) and (9) yield

(12) $m_{n-1}(E(\delta)) = \sigma_1\sigma_{n-3}\,\dfrac{\delta^{n-2}}{n-2} = 2\pi V_{n-2}\,\delta^{n-2}$,

which reduces to Theorem A' when $n = 3$. If $\delta = 1$, (12) and (9) give the recursion
formula

(13) $nV_n = 2\pi V_{n-2}$.

Since $V_1 = 2$ and $V_2 = \pi$, (13) enables us to compute $V_n$ for all $n$. By induction,

(14) $V_n = \dfrac{2}{n}\,\dfrac{\pi^{n/2}}{\Gamma(n/2)}$.

(The only properties of the gamma function that are needed here are:
$\Gamma(1) = 1$, $x\Gamma(x) = \Gamma(x+1)$, $\Gamma(\tfrac12) = \sqrt{\pi}$.)
Finally, (9) and (14) give

(15) $\sigma_{n-1} = \dfrac{2\pi^{n/2}}{\Gamma(n/2)}$.
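
The recursion (13) and the closed forms (14) and (15) can be compared numerically. The following is a minimal Python sketch, not part of the original paper; the function names are illustrative assumptions. It starts from $V_1 = 2$ and $V_2 = \pi$, applies (13), and checks the result against (14).

```python
import math


def ball_volume_recursive(n):
    """V_n from recursion (13): n * V_n = 2 * pi * V_{n-2}, with V_1 = 2, V_2 = pi."""
    if n == 1:
        return 2.0
    if n == 2:
        return math.pi
    return 2.0 * math.pi * ball_volume_recursive(n - 2) / n


def ball_volume_closed(n):
    """V_n from the closed form (14): (2/n) * pi^(n/2) / Gamma(n/2)."""
    return (2.0 / n) * math.pi ** (n / 2) / math.gamma(n / 2)


def sphere_area(n):
    """sigma_{n-1} = n * V_n, which is relation (9)."""
    return n * ball_volume_closed(n)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, ball_volume_recursive(n), ball_volume_closed(n), sphere_area(n))
```

For $n = 3$ both routes give the familiar $4\pi/3$ for the ball and $4\pi$ for the sphere.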


This research was partially supported by NSF Grant GP-24182. 


THE CHROMATIC POLYNOMIAL OF A COMPLETE BIPARTITE GRAPH 
J. R. SWENSON, University of Toronto 


We use [1] for all definitions in graph theory. In particular, we only consider 
finite undirected graphs without loops or multiple edges. A bipartite graph is one 
whose point set can be partitioned into two sets, U and V, such that every edge of the 
graph runs from U to V. It is complete if every point of U is connected to every 
point of V [1, p. 17]. If there are p points in U and q points in V, then we denote the 
complete bipartite graph by $K_{p,q}$. Denote by $f(G, t)$ the number of different colorings
of the labelled graph $G$ using $t$ colors; $f(G, t)$ is a polynomial called the chromatic
polynomial of $G$ [1, pp. 146 ff].

The chromatic polynomial of a graph has some general properties which are
easy to determine, e.g., its coefficients alternate in sign [1, p. 147], and there is a
general formula for $f(G, t)$. The latter, however, is difficult to evaluate in general.
We offer here an argument for evaluating $f(K_{p,q}, t)$.

Let $K_{p,q}$ be partitioned into its sets $U$ and $V$ and let there be $t$ colors. Choose
$r$ colors. Let there be $c(p, t, r)$ ways to color the $p$ independent vertices of $U$ using $r$
colors chosen from the $t$ colors. Then exactly $t - r$ colors remain and these may be
used in exactly $(t - r)^q$ ways to color the remaining vertices in $V$. Thus there are

$$\sum_{r=1}^{p} c(p, t, r)\,(t - r)^q$$

ways to color $K_{p,q}$. Our problem is to evaluate $c(p, t, r)$.

There are $\binom{t}{r}$ ways to choose $r$ colors from $t$ colors. The number of ways to
color the $p$ distinguishable points with exactly these $r$ colors is equivalent to
distributing $p$ distinguishable balls into $r$ distinguishable boxes leaving no box empty.
By Riordan [2, p. 91] the latter can be done in exactly

$$r!\,S(p, r)$$

ways, where $S(p, r)$ is a Stirling number of the second kind. Thus,
$c(p, t, r) = r!\,S(p, r)\binom{t}{r}$.
We can simplify

(1) $\displaystyle\sum_{r=1}^{p} \binom{t}{r}\, r!\,S(p, r)\,(t - r)^q$

as follows. Let $(t)_r = t(t - 1)\cdots(t - r + 1)$ be the "falling factorial". Then

$$\binom{t}{r}\, r! = (t)_r.$$



Also, from Riordan [2, p. 33] we have $(t - r)^q = \sum_{s=1}^{q} S(q, s)(t - r)_s$. As $(t)_r(t - r)_s
= (t)_{r+s}$, we can obtain after substitution in (1)

$$f(K_{p,q}, t) = \sum_{r=1}^{p}\sum_{s=1}^{q} S(p, r)\,S(q, s)\,(t)_{r+s}.$$

The last expression shows the symmetry in $p$ and $q$ which must obtain since
$K_{p,q} = K_{q,p}$. It also shows that the polynomial is of degree $p + q$ and, as each
expression $(t)_{r+s}$ is monic and $S(p, p) = S(q, q) = 1$, we have that the leading
coefficient of $f$ is 1.
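
The single sum (1) and the symmetric double sum can both be checked against a direct count of colorings for small cases. The following Python sketch is an illustration only and is not part of the original note; all function names are assumptions chosen here.

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind S(n, k), via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def falling(t, k):
    """Falling factorial (t)_k = t (t-1) ... (t-k+1)."""
    result = 1
    for i in range(k):
        result *= t - i
    return result


def chromatic_single_sum(p, q, t):
    """Expression (1): sum over r of C(t, r) r! S(p, r) (t - r)^q."""
    return sum(comb(t, r) * factorial(r) * stirling2(p, r) * (t - r) ** q
               for r in range(1, p + 1))


def chromatic_double_sum(p, q, t):
    """Final formula: sum over r, s of S(p, r) S(q, s) (t)_{r+s}."""
    return sum(stirling2(p, r) * stirling2(q, s) * falling(t, r + s)
               for r in range(1, p + 1) for s in range(1, q + 1))


def chromatic_brute_force(p, q, t):
    """Count colorings of K_{p,q} directly: no color may appear on both sides."""
    count = 0
    for colors in product(range(t), repeat=p + q):
        if not set(colors[:p]) & set(colors[p:]):
            count += 1
    return count
```

For $K_{2,2}$ (a 4-cycle) with $t = 3$, the double sum reads $(3)_2 + 2(3)_3 + (3)_4 = 6 + 12 + 0 = 18$.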


References 


1. F. Harary, Graph Theory, Addison-Wesley, Reading, Mass., 1969.
2. J. Riordan, An Introduction to Combinatorial Analysis, Wiley, New York, 1958.


MATHEMATICAL EDUCATION 
EDITED BY J. G. HARVEY AND M. W. POWNALL 


Material for this Department should be sent to Shirley Hill, Department of Mathematics, 
University of Missouri, Kansas City, MO 64110, or to Paul Mielke, Department of
Mathematics, Wabash College, Crawfordsville, IN 47933.


TRAINING SECONDARY MATHEMATICS TEACHERS IN VENEZUELA 


D. B. AICHELE, Oklahoma State University 


As a visiting professor of mathematics at the Universidad de Carabobo in Valencia,
Venezuela, I had the opportunity to travel quite extensively throughout the country. 
Since my primary duties at Oklahoma State University are in the area of teacher 
training, I naturally took advantage of opportunities to learn of the Venezuelan 
approach to training secondary mathematics teachers. In this regard, this paper sum- 
marizes the policies of the government-supported Instituto Pedagogico Experimental 
de Barquisimeto, which is one of the eight institutions (6 government- and 2 privately- 
supported) charged with preparing secondary level teachers. Its program is fairly 
representative of secondary level teacher preparation conducted in Venezuela. 

Before looking at the actual teacher education program at Barquisimeto, I believe 
it is necessary to understand something of Venezuelan education in general. Although 
Venezuela is perhaps the richest and most educated of all the Latin American
countries, it nonetheless has a literacy problem. In the fight against illiteracy, 1.5 million
adults have been taught how to read and write during the past 10 years; but still 
11% of the 10.4 million Venezuelans are illiterate. 


PROBLEMS AND SOLUTIONS 
EDITED BY EMORY P. STARKE


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD 
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN, 
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: EARL M. L. 
BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, WILLIAM R. 
GEIGER, RAMESH GUPTA, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS 
S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND WILLIAM L. SOULE, JR. 


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
December 31, 1973.


An asterisk (*) means neither the proposer nor the editors supplied a solution. 
E 2426*. Proposed by C. A. Long, Bowling Green University 


It is easy to show that the equilateral triangle can be inscribed in a square, and 
that a square can be inscribed in a regular pentagon. Can a regular pentagon be 
inscribed in a regular hexagon? 


E 2427*. Proposed by Harry Ruderman, Hunter College Campus School 


Suppose that 1 is written as a sum of n distinct Egyptian fractions. Find upper 
and lower bounds for the smallest fraction in the sum. 


E 2428. Proposed by M. S. Klamkin, Ford Motor Company 
If $a_i$ $(i = 1, 2, \ldots, n)$ denote real numbers, show that

$$n\min(a_i) \le \sum a_i - S \le \sum a_i + S \le n\max(a_i),$$

where

$$(n-1)S^2 = \sum_{1 \le i < j \le n} (a_i - a_j)^2 \qquad (S \ge 0),$$

and with equality if and only if $a_i$ = constant.




808 ELEMENTARY PROBLEMS AND SOLUTIONS [September 


E 2429. Proposed by E. T. H. Wang, University of British Columbia 


Let $k_n$ denote the least integer such that every $n \times n$ matrix of zeros and ones
with exactly $k_n$ ones in each row and in each column contains a $2 \times 2$ submatrix
without zero. Obtain a lower estimate for $k_n$ and discuss the case of equality.


E 2430. Proposed by John Masley, University of Illinois at Chicago Circle 


Let $a$ and $m$ denote natural numbers, and let $\varphi$ denote Euler's totient function.
Euler's generalization of Fermat's "Little Theorem" asserts that if $(a, m) = 1$, then

(*) $a^{\varphi(m)+1} \equiv a \pmod{m}$.

Show that (*) holds if and only if the following: if $p$ is a prime that divides $a$, then
$p^k \mid a$ whenever $p^k \mid m$.


E 2431. Proposed by Alan McConnell, Howard University 


Consider a finite field $F$ with elements $a_1, a_2, \ldots, a_n$. Form the Vandermonde
matrix $V(a_1, \ldots, a_n) = (v_{ij})$, where $v_{ij} = (a_i)^{j-1}$ for $i, j = 1, 2, \ldots, n$ (and where
$0^0 = 1$). Find $V^{-1}$ and evaluate $\det V$ (where the operations are carried out in $F$).


SOLUTIONS OF ELEMENTARY PROBLEMS 


A Quadrilateral Proportion 


E 1085 [1953, 551]. Proposed by Josef Langr, Prague, Czechoslovakia 


The perpendicular bisectors of the sides of a quadrilateral $Q$ form a quadrilateral
$Q_1$, and the perpendicular bisectors of the sides of $Q_1$ form a quadrilateral $Q_2$. Show
that $Q_2$ is similar to $Q$ and find the ratio of similitude.

Partial Solution by Martin Thomas, Shirley, Southampton, England. Let
$ABCD$ be $Q$. Let $A', B', C', D'$ be the circumcenters of $BCD$, $CDA$, $DAB$, $ABC$, so
that $A'B'C'D'$ is $Q_1$. Let $A''B''C''D''$ be $Q_2$. Now $A'$ and $C'$ lie on the perpendicular
bisector of $BD$, and similarly $B''$ and $D''$ lie on the perpendicular bisector of $A'C'$.
Thus $B''D''$ and $BD$ are parallel. Also $AB$ and $A''B''$, $BC$ and $B''C''$, etc., are pairs of
parallel lines. Hence triangles $ABD$ and $A''B''D''$ are directly similar, as are also
$BCD$ and $B''C''D''$. It follows that $ABCD$ and $A''B''C''D''$ are similar.

To see that the ratio of similitude can be any positive real number, note that if $Q$
is a rhombus whose acute angle is $\theta$, then $Q_1$ is a similar rhombus rotated through 90°.
If their ratio of similitude is $r$, then $r \to 0$ as $\theta \to 90°$ and $r \to \infty$ as $\theta \to 0°$. For $\theta = 45°$,
$r = 1$ and $Q$ and $Q_2$ coincide.

In the general case an expression for the ratio of similitude of $Q_2$ to $Q$ seems
prohibitively involved. [A "reasonable" expression for this ratio is solicited. Ed.]


A Difficult Triangle Inequality 
E 2245 [1970, 652; 1971, 793; 1972, 1034]. Proposed by A. W. Walker, Toronto, 
Canada 


If $A, B, C$; $a, b, c$; $s$ are the angles, side lengths, and semi-perimeter of any plane
triangle, then

$$(a + b + c)^3(s - a)(s - b)(s - c) \ge (a^2 + b^2 + c^2)^3 \cos A \cos B \cos C.$$

III. Comment by Robert Breusch, Amherst College. We shall show how the
following inequality of Van Tooren [1972, 1035, lines 7-8] (equivalent to Walker's
inequality above) does indeed hold for all nonnegative $a, b, c$:

$$(a+b+c)(-a+b+c)(a-b+c)(a+b-c)\left[(abc)^2(a+b+c)^2 - (a^2+b^2+c^2)^4\right] + 8(abc)^2(a^2+b^2+c^2)^3 \ge 0.$$

Assume that $a, b, c$ are nonnegative and (without loss of generality) that $a + b + c = 1$.
Let $x = a^2 + b^2 + c^2$ and $y = abc$. With a little rearrangement Van Tooren's
inequality becomes

$$K(x, y) = 8x^3y^2 + (x^4 - y^2)(2x + 8y - 1) \ge 0.$$


Clearly $1/3 \le x \le 1$ and $0 \le y \le 1/27$, so that $x^4 - y^2 > 0$ and $K(x, y) > 0$ whenever
$1/2 < x \le 1$; we can thus restrict our attention to those $x$ which satisfy $1/3 \le x \le 1/2$.
Now $ab + bc + ca = (1 - x)/2$ so that $a, b, c$ are the three zeros of $f(t) = t^3 - t^2
+ \tfrac12(1 - x)t - y$ and it is known that the polynomial $t^3 - At^2 + Bt - C$ has three
real zeros if and only if

$$4B^3 - A^2B^2 + 4A^3C - 18ABC + 27C^2 \le 0.$$

It follows that $x$ and $y$ must satisfy an inequality which, after some rearrangement,
can be written in the form

$$\left(y - \frac{5 - 9x}{54}\right)^2 \le \frac{(6x - 2)^3}{108^2},$$

that is, $y \ge m(x)$, where

$$m(x) = \frac{5 - 9x}{54} - \frac{(6x - 2)^{3/2}}{108}.$$

Calculating the partial derivative, we see that

$$\frac{\partial K}{\partial y} = 8(x^4 - 3y^2) + 2y(8x^3 - 2x + 1),$$

which is positive since $x \ge 1/3$ and $y \le 1/27$. This means that $K(x, y) \ge K(x, m(x))
= F(x)$, so that if we can show that $F(x) \ge 0$ for $1/3 \le x \le 1/2$, we are done.
Making the transformation $z = \tau(x) = (6x - 2)^{1/2}$ so that $x = \tau^{-1}(z) = (z^2 + 2)/6$,
we see that $0 \le z \le 1$ as $1/3 \le x \le 1/2$, and that

$$G(z) = F(\tau^{-1}(z)) = 2^{-4}3^{-9}z^2(1 - z)^5(72 + 32z + 49z^2 + 3z^3 + 7z^4 - z^5),$$

which is clearly nonnegative for $0 \le z \le 1$.
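
The factorization of $G(z)$ can be checked in exact rational arithmetic by substituting $x = (z^2+2)/6$ and $y = m(x)$ into $K(x, y)$. The Python sketch below is an illustration added here, not part of Breusch's comment; the function names are assumptions, and the factored form coded in `G` is the one obtained by expanding $K(x, m(x))$ in powers of $z$.

```python
from fractions import Fraction


def K(x, y):
    """K(x, y) = 8 x^3 y^2 + (x^4 - y^2)(2x + 8y - 1)."""
    return 8 * x**3 * y**2 + (x**4 - y**2) * (2 * x + 8 * y - 1)


def x_of_z(z):
    """Inverse of z = (6x - 2)^(1/2)."""
    return (z**2 + 2) / 6


def m_of_z(z):
    """m(x) = (5 - 9x)/54 - (6x - 2)^(3/2)/108, written in terms of z."""
    return (5 - 9 * x_of_z(z)) / 54 - z**3 / 108


def G(z):
    """Factored form of K(x, m(x)) as a polynomial in z."""
    poly = 72 + 32 * z + 49 * z**2 + 3 * z**3 + 7 * z**4 - z**5
    return z**2 * (1 - z)**5 * poly / (2**4 * 3**9)


if __name__ == "__main__":
    for i in range(11):
        z = Fraction(i, 10)
        print(z, K(x_of_z(z), m_of_z(z)) == G(z), float(G(z)))
```

Both sides are polynomials of degree 12 in $z$, so agreement at more than 12 rational points confirms the identity.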


Iterated Composition of a Function of Prime Factors of n


E 2356 [1972, 518]. Proposed by J. B. Roberts, Reed College 


If n is a natural number, define f(n) to be 1 plus the sum of the prime factors of n, 
each prime being counted according to its multiplicity. For example, f(12) = 8. 
Prove that if $n$ is greater than 6, then the sequence of iterates $n, f(n), f(f(n)), \ldots$
contains an 8, and hence from some point on, must repeat: $8, 7, 8, 7, \ldots$.


Solution by Hans Kappus, Switzerland. Since $f(n) \ge 7$ for $n > 6$ and $f(7) = 8$,
it suffices to prove the following property of $f(n)$:

If $n \ge 9$, then either $n$ is composite and $f(n) \le n - 2$, or $n$ is prime whence
$f(n) = n + 1$ (which is composite) and then $f(f(n)) \le f(n) - 2 = n - 1$.

In fact, this is true for $n = 9$, so let us assume that it is true for all $k$ such that
$9 \le k < n$. Let $n$ be composite, $n = k_1 \cdot k_2$. Then $(k_1 - 1)(k_2 - 1) \ge 4$. Furthermore,
$f(k_i) \le k_i + 1$ if either $k_i < 6$ or $k_i$ is prime $\ge 7$; otherwise $f(k_i) \le k_i - 2$
by assumption. Also

$f(n) = f(k_1) + f(k_2) - 1$.

Hence $f(n) \le k_1 + 1 + k_2 + 1 - 1$. But

$k_1 + 1 + k_2 + 1 - 1 = n + 2 - (k_1 - 1)(k_2 - 1)$.

Therefore $f(n) \le n + 2 - 4 = n - 2$.
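
The statement of E 2356 is easy to test by machine for a range of starting values. The sketch below is illustrative only and is not part of the published solution; the function names `f` and `steps_to_eight` are chosen here, and trial division is just the simplest way to factor small numbers.

```python
def f(n):
    """1 plus the sum of the prime factors of n, counted with multiplicity."""
    total, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            total += d
            n //= d
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        total += n
    return total


def steps_to_eight(n):
    """Number of applications of f needed to reach 8, or None if a cycle avoids 8."""
    seen = set()
    steps = 0
    while n != 8:
        if n in seen:
            return None
        seen.add(n)
        n = f(n)
        steps += 1
    return steps


if __name__ == "__main__":
    for start in (7, 9, 12, 97, 1000):
        print(start, steps_to_eight(start))
```

For example, $f(12) = 2 + 2 + 3 + 1 = 8$, $f(8) = 7$, and $f(7) = 8$, which is the terminal cycle named in the problem.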

Also solved by Anders Bager (Denmark), S. Baskaran (India), Problem Solving Group of Berne 
(Switzerland), D. M. Bloom, Peter Bundschuh (Germany), R. J. Evans, Scott Forrest, Ray Glenn, 
Michael Goldberg, M. G. Greening (Australia), C. V. Heuer, W. M. Hill and his Linear Algebra 
Class, Wells Johnson, Václav Konečný, L. Kuipers, O. P. Lossers (Netherlands), Kevin McAvaney,
Carolyn MacDonald, William Margolis, Helen M. Marston, Norman Miller, L. R. Nyhoff, M. R. 


Railkar (India), Eric Rosenthal, Steven Russ, Harry Sherman, Nan-Shan Shou, Allen Stenger, 
Walter Stromquist, R. K. Tamaki, S. J. Tillman, Mike Vitale, Charles Wexler, and the proposer. 


The Non-disjointness of Infinitely Many Sets having the Same Probability 


E 2362 [1972, 663]. Proposed by C. H. Kimberling, University of Evansville 
Suppose that in some probability space, $E_1, E_2, \ldots$ are events with common
probability $p$. Let $m \ge 2$ be a fixed integer. Prove or disprove that

$$p^m \le \sup\{P(E_{i_1} \cap E_{i_2} \cap \cdots \cap E_{i_m})\},$$

where the supremum is taken over all $m$-tuples $(i_1, i_2, \ldots, i_m)$ of distinct natural
numbers.

Solution by Ellen Hertz, Bronx, N.Y. Let $X_i$ be the indicator variable of the
event $E_i$, $i = 1, 2, \ldots$. By Jensen's inequality [see Parzen, Modern Probability
Theory and its Applications, p. 434]

$$E\big(((X_1 + \cdots + X_n)/n)^m\big) \ge p^m.$$

If we expand $(X_1 + \cdots + X_n)^m$ without collecting terms we obtain a sum of $n^m$
monomials of which $n(n-1)\cdots(n-m+1)$ contain $m$ distinct $X_i$. Write
$(X_1 + \cdots + X_n)^m = T_1 + T_2$ where $T_1$ consists of the products of $m$ distinct $X_i$ and
$T_2$ consists of those with fewer. $E(T_2)$ is the sum of $n^m - n(n-1)\cdots(n-m+1)$
nonnegative terms each at most $p$ so that

$$0 \le E(T_2)/n^m \le \big(n^m - n(n-1)\cdots(n-m+1)\big)p/n^m = o(1) \quad \text{as } n \to \infty.$$

Let $S_n$ be the set of all $m$-tuples $(i_1, \ldots, i_m)$ of distinct natural numbers such that
$1 \le i_j \le n$, $j = 1, 2, \ldots, m$. Then

$$E(T_1) = \sum_{(i_1, \ldots, i_m) \in S_n} E(X_{i_1} \cdots X_{i_m}) \le n(n-1)\cdots(n-m+1) \sup_{(i_1, \ldots, i_m) \in S_n} E(X_{i_1} \cdots X_{i_m}).$$

It follows that

$$p^m \le \big(E(T_1) + E(T_2)\big)/n^m \le o(1) + \frac{n(n-1)\cdots(n-m+1)}{n^m} \sup_{(i_1, \ldots, i_m) \in S_n} E(X_{i_1} \cdots X_{i_m}).$$

The conclusion now follows upon taking the limit as $n \to \infty$.


Also solved by Janos Galambos, J. Gillis (Israel), J. C. Kieffer, Gérard Letac (France), and 
Andrew Odlyzko. 


Editorial Note. Gillis and Odlyzko give extensions based on an infinite version of Ramsey’s 
theorem. Galambos cites J. Galambos and A. Renyi, On quadratic inequalities in the theory of probabi- 
lity, Studia Scient. Math. Hungar., 3 (1968), 351-358 and J. Galambos, Quadratic inequalities among 
probabilities, Ann. Univ. Sci. Budapest, Eötvös, Sect. Math., 12 (1969), 11-16 which contain related
results. 


An Integer Inequality 


E 2368 [1972, 772]. Proposed by C. V. Heuer, Concordia College 


Prove that if $1 < x_1 < x_2 < \cdots < x_k < y_1 < y_2 < \cdots < y_m$ are integers such that
$\sum x_i = \sum y_i$, then $\prod x_i > \prod y_i$.




Solution by Ivan Niven, University of Oregon. It is convenient to prove a little
more, namely that the same conclusion holds if

$$1 < x_1 < x_2 \le x_3 \le \cdots \le x_k < y_1 \le y_2 \le \cdots \le y_m.$$

Note that $k \ge 2$ and $k > m$. Letting $s = \sum x_i$, the proof is by induction on $k$ and $s$.
For $k = 2$ we note that $m = 1$, $x_1 \ge 2$, $x_2 \ge 3$, $s \ge 5$. Then $x_1 + x_2 \ge y_1$ implies
$x_1x_2 > y_1$ because $x_1x_2 - x_1 - x_2 = (x_1 - 1)(x_2 - 1) - 1 > 0$. This proves the
result for $k = 2$, so now suppose the result holds for all $k < j$ and look at the case
$k = j$. To get a start on induction on $s$, we note that the least value of $s$ is $3j - 1$
from the values $x_1 = 2$, $x_2 = x_3 = \cdots = x_j = 3$. Here $\prod x_i = 2 \cdot 3^{j-1}$ and $y_i \ge 4$
for all $i$. By the inequality of the arithmetic-geometric mean,

$$\prod y_i \le \Big(\sum y_i/m\Big)^m \le (s/m)^m = \big((3j - 1)/m\big)^m.$$

Also $s/m \ge \sum y_i/m \ge 4$ and $m \le s/4 = (3j - 1)/4$. Now with $m$ so restricted, we
prove easily that $\{(3j - 1)/m\}^m$ is an increasing function of $m$ by differentiating its
logarithm with respect to $m$ to get $\log(3j - 1) - 1 - \log m$. This derivative is positive
because $s/m \ge 4 > e$ gives $3j - 1 > me$. Hence it suffices to prove $2 \cdot 3^{j-1}
> \{(3j - 1)/m\}^m$ for the largest possible $m$, namely $m = (3j - 1)/4$. Simple algebra
completes this proof, and so the result holds for $k = j$ and $s = 3j - 1$.

Using induction on $s$ with $k = j$, we look at some particular $s > 3j - 1$, assuming
the result for smaller values of $s$. Note that $x_1 \ge 3$ or $x_j \ge 4$ in all cases.

In case $x_1 \ge 3$ we note that $x_1 - 1, x_2, x_3, \ldots, x_j, y_1, y_2, \ldots, y_{m-1}, y_m - 1$ form
a set of values satisfying the hypotheses (with $y_m - 1$ in perhaps an earlier position
in the ordering of the $y$'s), and likewise the set $x_2, x_3, \ldots, x_j, y_1, y_2, \ldots, y_{m-1}$. Applying
the result in these two cases and adding, the proof is complete.

In case $x_1 = 2$, then $x_j \ge 4$ and $y_i \ge 5$ for $i = 1, 2, \ldots, m$. We note that the sets
$x_1, x_2, \ldots, x_{j-1}, x_j - 1, y_1, y_2, \ldots, y_{m-1}, y_m - 1$ and $x_1, x_2, \ldots, x_{j-1}, y_1, y_2, \ldots, y_{m-1}$
satisfy the hypotheses, and these give inequalities on products which again add to
give the desired result.
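
The original statement of E 2368 (strictly increasing integers, equal sums) can be spot-checked by exhaustive search over small values. The sketch below is illustrative and not part of Niven's solution; the function name and the search bound are assumptions. It takes every subset of $\{2, \ldots, \text{limit}\}$, splits the sorted subset into a block of $x$'s followed by a block of $y$'s, and tests the product inequality whenever the sums agree.

```python
from itertools import combinations
from math import prod


def search_equal_sum_splits(limit):
    """Return (instances, counterexamples) for 1 < x_1 < ... < x_k < y_1 < ... < y_m <= limit."""
    instances = 0
    counterexamples = []
    values = range(2, limit + 1)
    for size in range(2, len(values) + 1):
        for subset in combinations(values, size):  # tuples come out sorted
            for cut in range(1, size):
                xs, ys = subset[:cut], subset[cut:]
                if sum(xs) == sum(ys):
                    instances += 1
                    if prod(xs) <= prod(ys):
                        counterexamples.append((xs, ys))
    return instances, counterexamples


if __name__ == "__main__":
    print(search_equal_sum_splits(12))
```

The smallest instance is $x = (2, 3)$, $y = (5)$, where $6 > 5$.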


REMARK. The result is false if the $x$'s and $y$'s are assumed to be real numbers
not necessarily integers, for example in the case $x_1 = 2$, $x_2 = 2.01$, $x_3 = 2.1$, $y_1 = 3$,
$y_2 = 3.1$. However, under the assumptions $e \le x_1 \le x_2 \le \cdots \le x_k < y_1 \le y_2
\le y_3 \le \cdots \le y_m$ and $\sum x_i = \sum y_i$, then $\prod x_i > \prod y_i$ for real numbers $x_i$ and $y_i$.


Also solved by Robert Breusch, Jordi Dou (Spain), Harry Lass, O. P. Lossers (Netherlands), 
Carolyn MacDonald, L. E. Mattics, Leo Ringwald, St. Olaf College Students, Wolfe Snow, Oto 
Strauch (Czechoslovakia), Jim Tattersall, Temple University Problem Solving Group, Louis Thurs- 
ton, Phil Tracy & Joe Mercer, P. H. Young, Alexandras Zujus, and the proposer. 


Just a Short (Random) Walk 


E 2369 [1972, 773]. Proposed by Harry Lass, California Institute of Technology 


For the two-dimensional symmetric random walk starting at the origin, show
that the probability of reaching the point (1, 0) before reaching any other point on
the line $x = 1$, is $1 - 2/\pi$.


Solution by Frederick Carty, Akron, Ohio. Let $S_m$ be the set of all paths of $m$
steps in length starting at the origin, ending at (1, 0) and not reaching the line $x = 1$
in the first $m - 1$ steps. Let $A_m$ be the number of paths in $S_m$. Note that no path
starting at the origin can reach (1, 0) in an even number of steps, i.e., $S_{2n}$ is empty
and $A_{2n} = 0$.

To evaluate $A_{2n-1}$, we first find the number of paths in $S_{2n-1}$ with $2j$ vertical
steps. The number of ways the vertical steps can be arranged is $\binom{2j}{j}$. The number
of ways the first $2n - 2j - 2$ horizontal steps can be arranged with no step to the
right of the origin is $\binom{2n-2j-2}{n-j-1}\frac{1}{n-j}$, as shown in Feller, Introduction to
Probability Theory and Its Applications. The number of ways an arrangement of $2j$
vertical steps can be interspersed with an arrangement of $2n - 2j - 2$ horizontal
steps is $\binom{2n-2}{2j}$. Thus

$$
\begin{aligned}
A_{2n-1} &= \sum_{j=0}^{n-1} \binom{2n-2}{2j}\binom{2j}{j}\binom{2n-2j-2}{n-j-1}\frac{1}{n-j} \\
&= \frac{1}{n}\binom{2n-2}{n-1}\sum_{j=0}^{n-1}\binom{n-1}{j}\binom{n}{j} \\
&= \binom{2n-1}{n}^2\frac{1}{2n-1}.
\end{aligned}
$$

Finally the desired probability $p_0$ is given by:

$$p_0 = \sum_{n=1}^{\infty} A_{2n-1}\,4^{-(2n-1)} = \sum_{n=1}^{\infty}\binom{2n-1}{n}^2\frac{4^{-(2n-1)}}{2n-1} = \sum_{n=1}^{\infty}\left(\frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdots 2n}\right)^2\frac{1}{2n-1}.$$

This final sum equals $1 - 2/\pi$ as found in Jolley, Summation of Series, p. 73.
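
Both the path count and the value of the series can be checked numerically. The Python sketch below is an illustration added here, not part of Carty's solution; the function names are assumptions. It counts the first-hit paths by brute force for small odd lengths, compares them with $\binom{2n-1}{n}^2/(2n-1)$, and sums the series far enough to see it approach $1 - 2/\pi$.

```python
import math
from itertools import product

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def count_first_hit_paths(m):
    """Brute-force A_m: m-step paths ending at (1, 0) that avoid x = 1 before the last step."""
    count = 0
    for path in product(STEPS, repeat=m):
        x = y = 0
        valid = True
        for index, (dx, dy) in enumerate(path):
            x += dx
            y += dy
            if x == 1 and index < m - 1:
                valid = False
                break
        if valid and (x, y) == (1, 0):
            count += 1
    return count


def closed_form_count(n):
    """A_{2n-1} = C(2n-1, n)^2 / (2n-1)."""
    return math.comb(2 * n - 1, n) ** 2 // (2 * n - 1)


def partial_probability(terms):
    """Partial sum of sum_n [(1*3*...*(2n-1)) / (2*4*...*2n)]^2 / (2n-1)."""
    total = 0.0
    ratio = 1.0
    for n in range(1, terms + 1):
        ratio *= (2 * n - 1) / (2 * n)
        total += ratio * ratio / (2 * n - 1)
    return total


if __name__ == "__main__":
    print([count_first_hit_paths(m) for m in (1, 2, 3, 4, 5)])
    print(partial_probability(200000), 1 - 2 / math.pi)
```

The terms decay like $1/(2\pi n^2)$, so the partial sums converge slowly; a few hundred thousand terms give about six correct digits.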


Also solved by Ellen Hertz and by the proposer. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers—The State University, 
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with
double spacing) on separate, signed sheets and should be mailed before December 31, 1973. Con- 
tributors (in the United States) who desire acknowledgment of receipt of their solutions are 
asked to enclose self-addressed, stamped postcards. 


An asterisk (*) means neither the proposer nor the editors supplied a solution. 
5922. Proposed by Paul Cohn, Bedford College, London, England 


$A$ and $B$ are two $m \times n$ matrices over an infinite field $k$ such that rank $(A + \lambda B)
\le$ rank $A$ for all $\lambda \in k$. Find $P$ and $Q$ of orders $m$, $n$ respectively such that

$$B = PA + AQ, \qquad PAQ = 0.$$
5923*. Proposed by Emeric Deutsch, Polytechnic Institute of Brooklyn 


Let $\|\cdot\|$ be a norm on the vector space $C^n$ of all $n$-tuples of complex numbers,
and let $A$ be an operator on $C^n$ such that $\|(A - \alpha I)^{-1}\| = r[(A - \alpha I)^{-1}]$ for each
complex $\alpha$ which is not in the spectrum of $A$ ($r$ denotes spectral radius). Is
$\|A\| = r(A)$?
5924. Proposed by Donald Girod, Canisius College, Buffalo, N.Y. 


A standard exercise in an introductory algebra course is to show that no group
$G$ can ever be the set theoretic union of two proper subgroups $H_1$, $H_2$. It is possible,
however, for a group to be the union of some finite number ($> 2$) of proper subgroups.
For example, $Z \oplus Z$ is the union of three proper subgroups. Characterize those
Abelian groups $G$ having a finite set of proper subgroups $\{H_1, \ldots, H_n\}$ such that
$G = H_1 \cup \cdots \cup H_n$.


5925. Proposed by A. G. O’Farrell, Brown University 
Show that the matrix $(a_{ij})$, where $a_{ij} = 1/(1 + |j - i|)$, is positive definite.
5926. Proposed by R. P. Boas, Northwestern University 
If $f$ and $g$ are nonnegative, bounded, and integrable over $(-\infty, \infty)$, does it
follow that

$$\int_{-\infty}^{\infty} \sup_t \{f(x - t)g(t)\}\,dx \ge \sup_x \int_{-\infty}^{\infty} f(x - t)g(t)\,dt\,?$$


5927. Proposed by C. R. Johnson, National Bureau of Standards 


Find all convex subsets K of the complex plane C such that if L is any convex 
subset of C, then {zw: ze K, we L} is convex. 




SOLUTIONS OF ADVANCED PROBLEMS 
Compactness and Open Covers 


5850 [1972, 399]. Proposed by R. K. Tamaki, California State College at Los
Angeles 


Let $X$ be metrizable. Prove that $X$ is compact if and only if, for every metric $d$
for $X$, every open cover $\{U_\alpha\}$ of $X$ has a Lebesgue number $\lambda > 0$ (i.e., we require
that each $d$-ball $B_d(x, \lambda)$ is contained in some $U_\alpha$).


Solution by A. A. Jagers, Twente University of Technology, Netherlands.
Suppose first that $X$ is compact. Fix a metric $d$ for $X$ and, given an open cover
$\{U_\alpha\}$ of $X$, choose for each $x \in X$ a number $\delta(x) > 0$ such that $B_d(x, 2\delta(x)) \subset U_\alpha$ for
some $\alpha$. This yields an open cover $\{B_d(x, \delta(x))\}$ which contains, in view of the
compactness of $X$, a finite subcover $\{B_d(x_i, \delta(x_i)) : 1 \le i \le n\}$. Now one easily
checks that $\lambda = \min\{\delta(x_i) : 1 \le i \le n\}$ is a positive Lebesgue number for $\{U_\alpha\}$.

Conversely suppose that, for every metric $d$ for $X$, every open cover $\{U_\alpha\}$ has a
Lebesgue number $\lambda > 0$. Then, $X$ being metrizable, it suffices to show that $X$ is
sequentially compact. To do so, suppose that $(a_n)_{n=1}^{\infty}$ is a sequence of points in $X$
without convergent subsequence and with $a_n \ne a_m$ for $n \ne m$. Then any subset of
$A = \{a_n : n = 1, 2, \ldots\}$ is closed. Now two possibilities arise. (i) But for a finite
number, all $a_n$ are cluster points of $X$. (ii) There exists an infinite subsequence
$(b_n)_{n=1}^{\infty}$ of $(a_n)_{n=1}^{\infty}$ such that all $b_n$ are isolated points of $X$. In the first case there
exists for each $n$ a real number $\varepsilon_n > 0$ such that $\lim_{n\to\infty}\varepsilon_n = 0$, $B_d(a_n, 2\varepsilon_n) \not\subset B_d(a_n, \varepsilon_n)$
and $a_m \notin B_d(a_n, \varepsilon_n)$ for $m \ne n$ and it follows at once that $\{B_d(a_n, \varepsilon_n) : n = 1, 2, \ldots\}
\cup \{X \setminus A\}$ is an open cover with no positive Lebesgue number. A contradiction. In
the second case, set $B = \{b_n : n = 1, 2, \ldots\}$ and consider a new metric $d_1$ given by
$d_1(x, y) = d(x, y)[1 + d(x, y)]^{-1}$ if $x, y \notin B$, $d_1(x, b_n) = d_1(b_n, x) = 1$ if $x \notin B$, and
$d_1(b_n, b_m) = 2|n^{-1} - m^{-1}|$. Then $d_1$ is also a metric for $X$ and this time
$\{B_{d_1}(b_n, n^{-2}) : n = 1, 2, \ldots\} \cup \{X \setminus B\}$ is an open cover of $X$ with no positive Lebesgue
number. This second contradiction completes the proof.


Also solved by O. P. Lossers (Netherlands), Simeon Reich (Israel), and the proposer. A number 
of incomplete arguments were received. 


Translates of Sequences and the Cantor Set 


5851 [1972, 399]. Proposed by Douglas Lind, Stanford University 


Is there a bounded sequence of real numbers each translate of which has only 
finitely many terms in the Cantor set? 


Solution by Don Coppersmith, Massachusetts Institute of Technology. Such
sequences do exist. We prove that no translate of the sequence $x_n = 13/3^n$ has more
than two elements of the Cantor set.




816 ADVANCED PROBLEMS AND SOLUTIONS [September 


Recall that the Cantor set $K$ is the set of numbers between 0 and 1, inclusive,
which have ternary representations consisting entirely of 0's and 2's. Notice that
$1/3 = (0.10000\ldots)_3 = (0.022222\ldots)_3$ is an element of the Cantor set. For consistency
we shall always choose the first representation (that with a string of 0's) when
presented with such an ambiguity. This is the only instance where a 1 can occur in
the ternary representation; it must be followed by all 0's.

Given $y$, assume that there are three values of $n$, $L < M < N$, such that $y + x_n \in K$.
Consider the difference between $(y + x_L)$ and $(y + x_N)$, two members of $K$. Depending
on the value of $N - L$, this difference is one of the following (all arithmetic in the
base 3):

TABLE I

      1110          11100          1110...0
    -  111        -   111        - 0...0111
      0222          10212          1102...2112

Here "0...0" represents a string of 0's of arbitrary positive or zero length, similarly
for "2...2". The differences are all to be divided by $3^N$, i.e., the "decimal" point
is to be inserted.

These numbers can be differences between elements of $K$ in only the following
ways:

TABLE II

      xx0000yy          xx0001zz          xx2000yy          xx02...2001zz
    +   0222          +   0222          +   0222          +        0222
      xx0222yy          xx1000zz          xx2222yy          xx10...0000zz

      xx02020yy         xx02021zz         xx02220yy         xx02221zz
    +   10212         +   10212         +   10212         +   10212
      xx20002yy         xx20010zz         xx20202yy         xx20210zz

      xx020 0...0 020yy       xx022 0...0 020yy
    +   110 2...2 112       +   110 2...2 112
      xx200 2...2 202yy       xx202 2...2 202yy

      xx020 0...0 021zz       xx022 0...0 021zz
    +   110 2...2 112       +   110 2...2 112
      xx200 2...2 210zz       xx202 2...2 210zz

Here xx and yy represent arbitrary strings of 0's and 2's; zz represents an infinite
string of 0's; 0...0 represents a finite (possibly null) string of 0's, and similarly for
2...2. The top number in each case represents $y + x_N$ and the bottom is $y + x_L$.
The rightmost explicit digit in the top number before the yy or zz is in the $N$th
ternary place.

Now do the same for the pair $(y + x_N)$ and $(y + x_M)$. We get another addition
from Table II with, again, the top line representing $y + x_N$, with the rightmost
explicit digit being in the $N$th place. Since $L$ and $M$ are distinct, we have used two
distinct differences from Table I (possibly different lengths of "2...2" in the third
example), and thus two distinct additions from Table II. But this means that two top
lines from Table II represent the same number $y + x_N$ in the same orientation.
Examination of the various top lines shows this to be impossible. Thus the proof.

Notice that if y = 41/81, then y + 13/81 = 2/3 and y + 13/27 = 80/81, both 
Cantor numbers. So there exist translates which contain two Cantor numbers, but 
none with three or more. 
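
The example $y = 41/81$ can be verified in exact arithmetic. The sketch below is illustrative and not part of Coppersmith's solution; the function names are assumptions. It tests membership of a rational number in the Cantor set by repeatedly applying the maps $x \mapsto 3x$ on $[0, 1/3]$ and $x \mapsto 3x - 2$ on $[2/3, 1]$; a rational orbit is finite, so the loop terminates.

```python
from fractions import Fraction


def in_cantor_set(x):
    """Exact Cantor-set membership test for a rational number x."""
    x = Fraction(x)
    seen = set()
    while x not in seen:
        seen.add(x)
        if x < 0 or x > 1:
            return False
        if x <= Fraction(1, 3):
            x = 3 * x
        elif x >= Fraction(2, 3):
            x = 3 * x - 2
        else:  # strictly inside the removed middle third
            return False
    return True  # the orbit cycles without ever leaving the two outer thirds


def cantor_hits(y, max_n=40):
    """Indices n <= max_n for which y + 13/3^n lies in the Cantor set."""
    return [n for n in range(1, max_n + 1)
            if in_cantor_set(y + Fraction(13, 3 ** n))]


if __name__ == "__main__":
    print(cantor_hits(Fraction(41, 81)))
```

For this $y$ the only hits are $n = 3$ and $n = 4$, giving $80/81$ and $2/3$; every later term lands strictly inside the removed middle third.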

This is the best possible in that differences between Cantor numbers comprise 
the entire closed interval [— 1,1], and any bounded sequence has two elements 
within one unit of each other, so that some translate of that sequence will carry both 


of these elements into the Cantor set. 

The related question, whether there exists a sequence each translate of which 
(through less than one unit, say, to avoid trivialities with boundedness) intersects the 
Cantor set in at least one point, can be answered negatively using measure theory, 
for such a construction would yield a covering of the unit interval by a countable 
number of copies of the measure-zero Cantor set. 


Also solved by D. Borwein, J. W. Grossman, Nicholas Passell, Konrad Victor (Israel), and the 
proposer. 


Completely Monotonic Functions 


5852 [1972, 400]. Proposed by C. H. Kimberling, University of Evansville 
Suppose $f$ carrying $[0, \infty)$ onto $(0, 1]$ has alternating derivatives:
$(-1)^k f^{(k)} \ge 0$, $k = 0, 1, \ldots$.
Prove $g(x) = (1 - f(x))/x$ has alternating derivatives on $(0, \infty)$.


I. Solution by Fred Schuurmann, Miami University, Ohio. Actually we may set
$g(x) = (f(0) - f(x))/x$, where $f(x)$ maps $[0, \infty)$ onto $(0, f(0)]$. It is shown by
induction that

$$x\,g^{(k)}(x) = -\{f^{(k)}(x) + k\,g^{(k-1)}(x)\}, \qquad g^{(k-1)}(0) = -\frac{f^{(k)}(0)}{k},$$

$k = 1, 2, \ldots$. Thus we have a first order linear differential equation in $g^{(k-1)}(x)$
and the solution is given by

$$g^{(k-1)}(x) = -\frac{1}{x^k}\int_0^x t^{k-1}f^{(k)}(t)\,dt.$$

Therefore $g$ has alternating derivatives.




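II. Solution by O. P. Lossers, Technological University, Eindhoven, the
Netherlands. Since $f'(x) \le 0$ and $f$ is onto, $f(0) = 1$. By Taylor's formula

$$1 = f(0) = f(x - x) = \sum_{k=0}^{n} \frac{(-1)^k x^k}{k!}f^{(k)}(x) + \frac{(-1)^{n+1}x^{n+1}}{(n+1)!}f^{(n+1)}(\theta x),$$

where $0 < \theta < 1$. Hence

$$
\begin{aligned}
g^{(n)}(x) &= \sum_{k=0}^{n}\binom{n}{k}\big(1 - f(x)\big)^{(k)}(-1)^{n-k}(n-k)!\,x^{-(n-k+1)} \\
&= \frac{(-1)^n n!}{x^{n+1}}\left[1 - \sum_{k=0}^{n}\frac{(-1)^k x^k}{k!}f^{(k)}(x)\right] = -\frac{1}{n+1}f^{(n+1)}(\theta x),
\end{aligned}
$$

which proves the assertion.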


Also solved by K. F. Andersen, Frederick Carty, Robert Heller, A.C. Hindmarsh, R. B. Kirk, 
Eitan Lapidot (Israel), B. E. Rhoades, Steven Russ, R. P. Soni, P. H. Young, and the proposer. 


Order of Products of Elements in an Abelian Group 


5853 [1972, 400]. Proposed by Gomer Thomas, University of Washington 


Let $x$ and $y$ be elements of a finite Abelian group $G$ with orders $m$ and $n$
respectively. Let $q$ be the order of $\langle x\rangle \cap \langle y\rangle$, the intersection of the cyclic subgroups
generated by $x$ and $y$. Give the possibilities for the order of $xy$, in terms of $m$, $n$,
and $q$.


Solution by the proposer. Let $l$ be the least common multiple of $m$ and $n$. If $\pi$ is
any collection of primes and $h$ is any integer, let $h_\pi$ be the (unique) integer defined by:
(i) $h_\pi$ divides $h$, (ii) every prime factor of $h_\pi$ is in $\pi$, and (iii) no prime in $\pi$ is a factor
of $h/h_\pi$. Let $\sigma$ be the set of primes which divide $q$ but do not divide either $l/m$ or $l/n$.

We shall prove that $k$ can occur as the order of $ab$, $k = o(ab)$, if and only if $k$ is
of the form $k = l/s$, where (1) $s$ divides $q_\sigma$, and (2) if $2 \mid q_\sigma$, then $2 \mid s$.

First, we show that $o(ab)$ must be of this form. Let $x = a^{m/q}$ and $y = b^{n/q}$.
Both $x$ and $y$ are generators of $\langle a\rangle \cap \langle b\rangle$, so $y = x^r$, for some integer $r$ satisfying
$(r, q) = 1$. The smallest integer $t$ such that $(ab)^t \in \langle a\rangle \cap \langle b\rangle$ is $t = l/q$, so $o(ab)
= (l/q) \cdot o((ab)^{l/q})$. Now $(ab)^{l/q} = x^{l/m}y^{l/n} = x^{l/m + rl/n}$. Since $o(x) = q$, we have $o((ab)^{l/q})
= q/s$, where $s = (l/m + rl/n, q)$. Since $(l/m, l/n) = 1$ and $(q, r) = 1$, no prime
dividing both $q$ and one of $l/m$ or $l/n$ can divide $l/m + rl/n$. Hence $s$ divides $q_\sigma$. If
2 divides $q_\sigma$, then $l/m$, $l/n$, and $r$ are all odd, so 2 divides $s$.

Now let us show that any number of the specified form can occur as $o(ab)$.
Specifically, let $m, n, q, s$ be positive integers with $m, n$ arbitrary, $q$ a divisor of
$(m,n)$, and $s$ satisfying (1) and (2). Given $k = l/s$, let $\pi$ be the set of primes dividing $s$,
$\rho$ be the set of primes dividing $q_\sigma$ but not $s$, and $\tau$ be the set of primes dividing $q$ but
not $q_\sigma$. Since $(l/n, q_\pi) = (l/n, q_\rho) = 1$, there exist integers $t_\pi$ and $t_\rho$ such that
$t_\pi l/n \equiv 1 \pmod{q_\pi}$ and $t_\rho l/n \equiv 1 \pmod{q_\rho}$.
By the Chinese Remainder Theorem, there is an integer $r$ which simultaneously
satisfies the congruences:


$$r \equiv 1 \pmod{q_\tau}, \qquad r \equiv t_\pi(s - l/m) \pmod{q_\pi}, \qquad r \equiv t_\rho\, l/m \pmod{q_\rho}.$$


It follows from these that $(r, q_\tau) = (r, q_\pi) = (r, q_\rho) = 1$, so $(r, q) = 1$. Moreover, the
last two congruences give:


$$l/m + rl/n \equiv s \pmod{q_\pi} \quad\text{and}\quad l/m + rl/n \equiv 2l/m \pmod{q_\rho},$$


which implies that $(l/m + rl/n, q) = s$.
Now, let $G = \langle u\rangle \times \langle v\rangle$, where $o(u) = l$ and $o(v) = n/q$. Let $a = u^{l/m}$ and
$b = u^{rl/n}v$. Then $o(a) = m$, $o(b) = n$,


$$o(\langle a\rangle \cap \langle b\rangle) = q, \quad\text{and}\quad o(ab) = l/s.$$
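For small parameters the criterion can be checked by brute force. The following Python sketch is an illustration only and is not part of the published solution: the helper names are hypothetical, and the group of the construction is modelled additively as $Z_l \times Z_{n/q}$ with $a = (l/m, 0)$ and $b = (rl/n, 1)$, where $r$ runs over the residues prime to $q$.

```python
from math import gcd


def prime_factors(h):
    """Return the set of primes dividing the positive integer h."""
    primes, p = set(), 2
    while p * p <= h:
        while h % p == 0:
            primes.add(p)
            h //= p
        p += 1
    if h > 1:
        primes.add(h)
    return primes


def part(h, primes):
    """h_pi: the largest divisor of h whose prime factors all lie in `primes`."""
    result = 1
    for p in primes:
        while h % p == 0:
            h //= p
            result *= p
    return result


def predicted_orders(m, n, q):
    """Orders l/s allowed by conditions (1) and (2) of the solution."""
    l = m * n // gcd(m, n)
    sigma = {p for p in prime_factors(q) if (l // m) % p and (l // n) % p}
    q_sigma = part(q, sigma)
    return {l // s for s in range(1, q_sigma + 1)
            if q_sigma % s == 0 and (q_sigma % 2 == 1 or s % 2 == 0)}


def attained_orders(m, n, q):
    """Orders of a + b in Z_l x Z_(n/q), a = (l/m, 0), b = (r*l/n, 1), gcd(r, q) = 1."""
    l = m * n // gcd(m, n)
    c = n // q
    orders = set()
    for r in range(1, q + 1):
        if gcd(r, q) != 1:
            continue
        x = (l // m + r * (l // n)) % l          # first coordinate of a + b
        first = l // gcd(l, x)                   # order of x in Z_l (gcd(l, 0) == l)
        orders.add(first * c // gcd(first, c))   # lcm with c, the order of 1 in Z_c
    return orders
```

Comparing `predicted_orders(m, n, q)` with `attained_orders(m, n, q)` over a range of small triples exercises both directions of the argument within this one family of groups.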


Also solved by J. W. King, and by Brian Wesselink. 


m-tuples and Their Branchings 


5854 [1972, 523]. Proposed by Stephen Gelbart, Princeton University 


Given a decreasing sequence of integers $k_1, \dots, k_n$, a branching is a sequence of
integers $k_1', \dots, k_{n-1}'$ with $k_i \ge k_i' \ge k_{i+1}$. Upon successively branching $n - 1$ times one
obtains a single integer; one calls a sequence of $n - 1$ successive branchings a
complete branching. Show that there are


$$\prod_{1 \le i < j \le n} \left[ (k_i - k_j + j - i)/(j - i) \right]$$


distinct complete branchings of a given sequence $\{k_i\}$.


Solution by Leonard Carlitz, Duke University. Let $N(k_1, \dots, k_n)$ denote the
number of distinct complete branchings of the sequence $\{k_1, \dots, k_n\}$. Since


$$\prod_{1 \le i < j \le n} \frac{x_i - x_j}{i - j} = \left| \binom{x_j}{i-1} \right| \qquad (i, j = 1, 2, \dots, n),$$


the stated result is equivalent to


(1)
$$N(k_1, \dots, k_n) = (-1)^{n(n-1)/2} \left| \binom{k_j - j - 1}{i - 1} \right| \qquad (i, j = 1, 2, \dots, n).$$


For $n = 2$ we have


$$N(k_1, k_2) = \sum_{k_1 \ge k_1' \ge k_2} 1 = k_1 - k_2 + 1$$


in agreement with (1). Also it is evident from the definition that

(2)
$$N(k_1, \dots, k_n, k_{n+1}) = \sum N(k_1', \dots, k_n'),$$

where the summation is over all $k_1', \dots, k_n'$ such that

(3)
$$k_1 \ge k_1' \ge k_2 \ge k_2' \ge \cdots \ge k_n \ge k_n' \ge k_{n+1}.$$


Assume that (1) holds up to and including the value $n$. Then, by (2)


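$$N(k_1, \dots, k_n, k_{n+1}) = (-1)^{n(n-1)/2} \sum \left| \binom{k_j' - j - 1}{i - 1} \right| \qquad (i, j = 1, 2, \dots, n),$$


where the summation is over all $k_1', \dots, k_n'$ satisfying (3). Hence


$$\begin{aligned}
N(k_1, \dots, k_n, k_{n+1}) &= (-1)^{n(n-1)/2} \left| \binom{k_j - j}{i} - \binom{k_{j+1} - j - 1}{i} \right| \qquad (i, j = 1, 2, \dots, n) \\
&= (-1)^{n(n+1)/2} \left| \binom{k_j - j}{i - 1} \right| \qquad (i, j = 1, 2, \dots, n + 1) \\
&= (-1)^{n(n+1)/2} \left| \binom{k_j - j - 1}{i - 1} \right| \qquad (i, j = 1, 2, \dots, n + 1).
\end{aligned}$$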


This completes the induction. 
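The product formula can be compared with a direct count based on the recursion (2). The Python sketch below is an illustration only; the function names are hypothetical, and the brute-force count is practical only for short sequences with small entries.

```python
from fractions import Fraction
from itertools import product


def count_branchings(ks):
    """Count complete branchings of a weakly decreasing sequence via recursion (2)."""
    if len(ks) == 1:
        return 1
    # The i-th entry of a branching ranges over k_{i+1}, ..., k_i, as in (3).
    ranges = [range(ks[i + 1], ks[i] + 1) for i in range(len(ks) - 1)]
    return sum(count_branchings(list(branch)) for branch in product(*ranges))


def product_formula(ks):
    """Evaluate the product over i < j of (k_i - k_j + j - i)/(j - i)."""
    n = len(ks)
    value = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            value *= Fraction(ks[i] - ks[j] + j - i, j - i)
    return int(value)
```

For the sequence 2, 1, 0 both functions return 8: the four branchings (2,1), (2,0), (1,1), (1,0) contribute 2, 3, 1, 2 complete branchings respectively.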


Also solved by Richard Stanley, and by the proposer. 


Editorial Note. Stanley derives the formula from Theorem 15.3 in his paper, Theory and applica- 
tion of plane partitions, Studies in Applied Math., 50 (1971), pp. 167-188, 259-279. 


THE AMERICAN 
MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 BY BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 NUMBER 8 
CODEN: AMMYAE

CONTENTS 
On the Theory of Interest . . . . . . . . DAVID GALE 853
The Nesting and Roosting Habits of the Laddered Parenthesis
. . . . . . . . R. K. GUY AND J. L. SELFRIDGE 868
Correction to "The Math. Societies and Associations in the U.K." . THOMAS WILLMORE 876
Squaring Rectangles and Squares . . N. D. KAZARINOFF AND ROGER WEITZENKAMP 877
What Every College President should Know about Mathematics . J. G. KEMENY 889
Some Mathematical Verses . . . . . . . . 902
Remarks on Women in Mathematics . . ASSOCIATION FOR WOMEN IN MATHEMATICS 903


MATHEMATICAL NOTES


A Note on Catalan Parentheses . . . . . . . . JOHN RIORDAN 904
Determination of the Riemann Function . . . . . . . E. J. SCOTT 906
Inequalities for the Area of Two Triangles . . . . . . . L. CARLITZ 910
A Banach Space Characterization of the Space of Affine Continuous Functions on a
Compact Convex Set . . . . . . . . P. D. TAYLOR 911
On Lattice Points Inside Convex Bodies . . . . . . . A. M. ODLYZKO 915
Increasing Continuous Singular Functions . . . . . . GERALD FREILICH 918


RESEARCH PROBLEMS
Questions on a Sequence of Ulam . . . . . . . BERNARDO RECAMAN 919
Equitable Coloring . . . . . . . . WALTER MEYER 920


CLASSROOM NOTES
The Lagrange Multiplier Rule . . . E. J. MCSHANE 922
The Mini-Max Property of the Tychonoff Product Topology . . D. E. CAMERON 925
Another Approach to the Cubic Interpolating Spline . . . B. H. ROSMAN 927


(Continued on inside cover) 


OCTOBER 1973 


MATHEMATICAL EDUCATION

On Behavioral Objectives in Mathematics Education . L. C. JANSSON AND R. T. HEIMER 930
Teaching a Computer-Oriented Laboratory Course for Ordinary Differential Equations
. . . . . . . . H. E. WILLIAMS AND DELMER DE BOER 933
Multiple-Choice Examinations in Mathematics, not Valid for Everyone
. . . . . . . . JERRY SILVER AND BERT WATTS 937

ELEMENTARY PROBLEMS AND SOLUTIONS . . . . . . . . 942
ADVANCED PROBLEMS AND SOLUTIONS . . . . . . . . 949
REVIEWS . . . . . . . . 953
NEWS AND NOTICES . . . . . . . . 969
MATHEMATICAL ASSOCIATION OF AMERICA . . . . . . . . 969
March Meeting of the Southeastern Section . . . . . . . . 969
March Meeting of the Southern California Section . . . . . . . . 971
April Meeting of the Iowa Section . . . . . . . . 972
April Meeting of the Nebraska Section . . . . . . . . 972
April Meeting of the North Central Section . . . . . . . . 973
April Meeting of the Ohio Section . . . . . . . . 973
April Meeting of the Oklahoma-Arkansas Section . . . . . . . . 974
May Meeting of the Illinois Section . . . . . . . . 975
Calendars of Future Meetings . . . . . . . . 976


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p. 2). Manuscript preparation: Please
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY.
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for
protection against loss.


Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes
11 months, Math. Education 10 months.


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of Mathematics,
Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor;
ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of America,
SUNY at Buffalo, Buffalo, N.Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B.
WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036.


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect 
ASSOCIATE EDITORS 


JOSHUA BARLAZ        J. G. HARVEY         SEYMOUR SCHUSTER
E. R. BERLEKAMP      ERIC S. LANGFORD     J. ARTHUR SEEBACH, Jr.
JANE W. DI PAOLA     P. D. LAX            E. P. STARKE
ROBERT GILMER        ARTHUR MATTUCK       LYNN A. STEEN
RICHARD GUY          M. W. POWNALL        JAMES WENDEL
RAOUL HAILPERN       GIAN-CARLO ROTA


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June-July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices.


Copyright © The Mathematical Association of America (Incorporated), 1973


PRINTED IN THE UNITED STATES OF AMERICA 


ON THE THEORY OF INTEREST 
DAVID GALE, University of California, Berkeley 


1. Introduction. Every science is concerned with a certain set of phenomena.
The "theory" is that part of the science which tries to explain why the given phenomena
occur. Some scientific theories, even very profound ones like the theory of
evolution, can be quite adequately presented in the language of ordinary discourse.
But there are other theories which really cannot be understood except in the most
superficial way without the use of mathematics. The man-in-the-street, no matter how
intelligent he is, cannot be expected to get any real comprehension of, say, the theory
of planetary motion, to say nothing of such things as quantum mechanics or
relativity, without knowing some mathematics.

Economic theory, as practiced in the past by the great theorists (Marshall, Marx, 
Keynes, etc.) has been essentially a verbal theory. In teaching economic theory today
there is an increasing tendency to "use" mathematics, but most of the basic ideas
can still be conveyed by purely verbal arguments. An example is the classical theory 
of price formation. Since it will be needed later let me recall briefly how it goes. The 
price of any good is determined in such a way that the demand for it will be equal 
to its supply. One argues that if the good is priced too high people will refuse to buy 
it and sellers finding themselves with inventories rotting on the shelves will lower 
their prices. Similarly, if the price is too low sellers will find themselves constantly 
running out and will raise prices. This is really all one has to understand in order to 
grasp the essential notion of so-called equilibrium prices. (In comparatively recent 
times using quite advanced mathematics people have succeeded in proving that such 
equilibrium prices will exist under suitable conditions. However, it is clearly not 
necessary for a person to know the Brouwer fixed point theorem in order to comprehend
the "law of supply and demand".)

When one deals with economic phenomena which are a little more complicated 
than the simple one just described it may not be possible to come up with verbal 
man-in-the-street explanations. The next level of economic complexity, it seems to
me, is what is called "the theory of interest" which, as we shall see, is really the
theory of the behavior of prices over time. This theory attempts to explain why the
institution of interest has been present in virtually all economic systems. This is
actually a rather puzzling fact, for throughout history many of the "great thinkers"
including Aristotle, Thomas Aquinas and, of course, Marx have condemned interest


David Gale received his Princeton Ph.D. under A. W. Tucker. He was at Brown University before 
joining the University of California, Berkeley. He held a Fulbright Research Scholarship at the 
University of Copenhagen, spent a year’s leave at the RAND Corporation, held a Guggenheim 
Scholarship at Osaka University, and an NSF Senior Postdoctoral Scholarship at the University of 
Copenhagen. He is a Fellow of the Econometric Society. His research in mathematical economics
includes the book The Theory of Linear Economic Models. Editor.




taking as unjustifiable and a form of extortion. The practice has often been attacked 
on moral and even religious grounds (Jesus drove the money lenders from the 
temple). Nevertheless, according to Irving Fisher, author of a classical treatise on the
subject, "despite all attempts to prohibit interest taking there is not and never has
been in all recorded history any time or place without the existence of interest."


Today most economists (but not all) agree that the explanation for interest is in
fact the same as that for prices. It is necessary to have (positive) interest rates in order
to balance supply and demand. Of course, demanding exorbitant interest is
unjustifiable just as it would be unjustifiable to demand an exorbitant price for food
or shelter, but (and this is probably less obvious) an interest rate which is too low
may be just as harmful to an economy as one which is too high. The object of this
essay is therefore two-fold. The first aim is to show, as convincingly as possible, why
positive interest rates are required for the proper functioning of an economic system. This
"qualitative" result is Theorem 2 of Section 5. We then examine a quantitative
question. It turns out that the interest rate for an economy is what physical scientists
call a "pure" or dimensionless number, meaning that its value is unchanged when we
change the units in which other quantities are measured. This means it must be
measuring something about the state of the economy. In Section 6 we describe
exactly what this something is. Very roughly, an economy is shown to have the
interest rate r if it is in a state where by sacrificing one unit of "consumption" in the
present it can achieve a permanent increase of r units of consumption in every period
thereafter.

I should perhaps say a word about why these results may be of interest to 
mathematicians. It seems to me there are two more or less opposite reasons for 
treating applications in a mathematics journal. The first is to show how "interesting
mathematics" comes up in connection with some applied problem. There exists by
now a fairly sizable literature of this kind where the coup de grâce may involve an
ingenious use of a fixed point theorem or some nonelementary measure theory. In 
the present exposition we are working the other side of the street. It is the problem 
which is of principal interest and one uses whatever mathematical methods seem 
appropriate for it, without regard to their aesthetic appeal. The analysis in the 
present case involves nothing more exotic than separating a pair of convex sets by a 
hyperplane at the critical moment. This is the type of mathematics that has dominated 
applications in economics throughout the "modern period" starting with Ville's
proof of the von Neumann minimax theorem. In any case the reader should not 
expect to see any mathematical rabbits pulled out of hats in the pages to follow and 
he should work through the succeeding analysis only if he is sufficiently interested in 
understanding what an interest rate is and why it occurs. (I should say that if one 
goes into the interest question in a more general setting, the mathematics picks up 
considerably. For example, the analysis in [1] and [3] is based on an elegant gener- 
alization of the Perron-Frobenius Theory of positive matrices.) 




Finally there is the bibliographical question: whose theory is this anyway, and
is it "correct"? In answer to the first question let me hasten to say, it's not mine.
The subject of interest theory is possibly as old as economic theory itself, and I 
would hate to have the task of tracking down all the antecedents to the ideas to be 
presented here. In some ways, however, the present theory which has evolved almost 
entirely over the past twenty years is rather different from that of the classical writers. 
For example, one of the things the modern theory has made clear is the important 
relationship between the interest rate and the population growth rate (the former 
will in general exceed the latter as we shall see) a fact which the classical writers 
were apparently not aware of. As for the current literature my own education and 
involvement in these questions grew out of reading [1] and [2]. A book which is 
closely parallel to our formulation of the quantitative result is [4]. At least a dozen 
other papers over the past two decades have touched on one aspect or another of the 
present exposition. The particular package presented here, however, has not, as far as
I know, been given elsewhere.

As for correctness of the theory, questions in economics are never definitively 
settled the way they are in the physical sciences. People can always exhibit situations 
and circumstances under which any proposed theory breaks down, and I’m sure 
there are some economists who would dispute the ideas put forward here and claim
that it is absurd to talk about interest in models which exclude such factors as, say,
money or uncertainty about the future. My own feeling, however, and I do believe
many economists would concur, is that the theory to be described in the following 
sections gives about as convincing an explanation of the phenomenon under study 
as one can hope to get in an area as complicated as economics. 


2. The laws of motion. A scientific theory often starts with certain assumptions, its
"laws of motion", and then proceeds to deduce consequences from them. We shall
follow this classical pattern and present here our basic economic laws, of which 
there are just two. Of course one cannot expect economic laws to hold exactly and 
invariably like, for example, the inverse square law of gravitational attraction, but 
if one is to accept the analysis which follows he must be willing to believe that, by 
and large, over sufficiently long periods of time these laws give a roughly accurate 
picture of economic reality. The first law has already been mentioned. 


(I) The supply and demand for each good in the economy are equal.


We need not be concerned at present about the precise definition of supply and
demand. Their meaning will be clear when we examine specific models. The idea is
simple enough. If we think of the people in an economy as being producers and
consumers we are requiring that the goods producers decide to produce shall be the
same as the ones consumers want to consume. Again, the law is probably never
satisfied exactly. On the other hand there is a sort of automatic feedback mechanism
which assures that supply and demand will not remain too far apart, for if a good is
seen to be, say, oversupplied for any length of time the situation will be corrected
either by producers supplying less of it or by lowering its price to increase demand.
Please note that these considerations are completely independent of the type of
economic system under consideration. Indeed, the laws of motion, and in fact the
whole theory to be presented here, are "universal" in that they apply equally to a centrally
planned socialist or laissez-faire capitalist economy, or anything in between.

The first law concerns only quantities of goods and would apply even to an
economy without a price system. In order to state the second law we must assume
that each good in the economy has a price and that goods are produced from other
goods.


(II) Among all possible methods of production, producers will choose those
which maximize profits.


At first glance this law may seem debatable. In a capitalist economy one might 
expect producers to maximize profits since presumably they will get to keep at least 
part of them, but why should the law hold in a planned economy in which profits 
revert to the state? The situation becomes more transparent if we rephrase the law 
to say that among all ways of producing a given commodity, producers should 
choose the one or ones whose cost is minimum. Stated in this way the law is hard to 
argue with, for consider the alternative. Why would authorities in a planned economy 
instruct producers to use a costly method of production when less costly ones were 
available?* Presumably the planners control the prices as well as the choice of
production technique and it hardly seems reasonable that these should be at "cross
purposes" with each other.


3. Interest rates: The Equivalence Principle. What is an interest rate? Probably
the most immediate way people think of interest is as a price one must pay for
borrowing money (or receives for lending it). Thus, to say that the interest rate is r per year
means that if a person borrows (lends) d dollars he pays (receives) rd dollars per year
throughout the duration of the loan. For our purposes it will be convenient to use a
slightly different but equivalent definition based on what we call the Equivalence
Principle.

Let us suppose there are $n$ goods in the economy and the price of the $i$th good is $p_i$
this year and will be $p_i'$ next year.


EQUIVALENCE PRINCIPLE: The following two economic worlds are equivalent:

(A) Money saved earns interest at the rate $r$ per year.

(B) Money saved earns no interest but the price of the $i$th good next year is
$p_i'/(1 + r)$ rather than $p_i'$.


* Curiously, some recent criticism of (II) has been aimed at its relevance for a capitalist rather 
than socialist economy. J. K. Galbraith seems to be claiming that large firms deliberately use non- 
profit-maximizing techniques in order to increase their scale of operation. Other economists, however, 
consider this a rather dubious proposition. 




In order for the reader to be convinced of this equivalence he need only think 
about it for a moment. Clearly the only thing a person who saves is concerned with 
is the purchasing power of his savings (we exclude consideration of misers). One sees 
at once that in either situation (A) or (B) the amounts of goods he can get for each 
dollar he saves is the same. Similarly, if he borrows a dollar to buy goods this year 
then in case (A) he repays (1 + r) dollars next year while in case (B) he repays only the 
original dollar which, however, has now become worth (1 + r) units of goods. But 
though the (A) and (B) worlds are equivalent, it is for many purposes much more 
convenient to adopt the (B) point of view. We illustrate this by a simple example 
which is standard in the theory of finance. For convenience we work with the interest
factor $\rho = 1 + r$ rather than with $r$ itself.

Suppose an individual anticipates a sequence of payments of money, his income,
$i_0, i_1, \dots, i_n$, over the coming $n + 1$ years. During that same period he plans to make a
sequence of expenditures $e_0, e_1, \dots, e_n$. He knows he can borrow or lend money at the
interest rate $r$ and the question he asks himself is whether through a suitable program
of borrowing and lending he can pay for the expenditure sequence $(e_j)$ from his
income sequence $(i_j)$. If the answer is yes, we call the expenditure sequence financially
feasible. The direct approach to answering this question would go as follows: In
year zero the planner gets $i_0$ and spends $e_0$. His savings is therefore $s_0 = i_0 - e_0$,
where of course $s_0$ can be negative, meaning that he must borrow in order to make
his initial purchases. In either case, whether positive or negative, $s_0$ draws interest so
his total wealth entering year 1 is $\rho s_0 + i_1$ of which he spends $e_1$. Hence, his savings
in year 1 is $s_1 = \rho s_0 + i_1 - e_1$, and so on. We thus have the system of equations


(3.1)
$$\begin{aligned}
i_0 &= e_0 + s_0, \\
\rho s_0 + i_1 &= e_1 + s_1, \\
&\;\;\vdots \\
\rho s_{n-1} + i_n &= e_n + s_n.
\end{aligned}$$


Now the condition that the expenditure sequence be feasible is by definition
precisely that the number $s_n$ be nonnegative, meaning that the planner is "solvent"
at the end of the $n$ year period. In order to see whether this will be so we multiply the
$j$th equation of (3.1) by $\rho^{-j}$ and sum, noticing that all the $s_j$ terms, except $s_n$, cancel
out, and we get


(3.2)
$$\sum_{j=0}^{n} i_j/\rho^j = \sum_{j=0}^{n} e_j/\rho^j + s_n/\rho^n,$$

and we see that $s_n$ is nonnegative if and only if

(3.3)
$$\sum_{j=0}^{n} e_j/\rho^j \le \sum_{j=0}^{n} i_j/\rho^j.$$

In economic terminology, the terms in the inequality are referred to as the present
value of the expenditure sequence and the income sequence respectively. We then
have a very simple criterion for feasibility which says that the expenditure sequence
is feasible if and only if its present value does not exceed that of the income sequence.

But now let us look at the same problem making use of the Equivalence Principle.
There are now no interest payments. Instead all prices change from one year to the
next by the factor $1/\rho$. If we think of the planner's income sequence as being his
salary then since the price of his labor is falling at the rate $1/\rho$ his income sequence
is $(i_0, i_1/\rho, \dots, i_n/\rho^n)$ and similarly the costs of the things he wants to purchase will
be falling at the rate $1/\rho$, so the expenditure sequence becomes $(e_0, e_1/\rho, \dots, e_n/\rho^n)$.
Now inequality (3.3) is simply the definition of feasibility, i.e., the requirement that
expenditures shall not exceed income, and no calculations are necessary!
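The two ways of testing feasibility can be compared numerically. The Python sketch below is an illustration only; the function names and the sample figures are assumptions, not taken from the text. It iterates the savings recursion (3.1) and, separately, discounts both sequences as in the (B) world.

```python
def final_savings(income, spending, rho):
    """Iterate (3.1): s_0 = i_0 - e_0 and s_j = rho*s_{j-1} + i_j - e_j; return s_n."""
    s = 0.0
    for i_j, e_j in zip(income, spending):
        s = rho * s + i_j - e_j
    return s


def present_value(payments, rho):
    """Sum of payments[j] / rho**j, the discounted sequence of the (B) world."""
    return sum(x / rho ** j for j, x in enumerate(payments))


def is_feasible(income, spending, rho):
    """Criterion (3.3): present value of spending does not exceed that of income."""
    return present_value(spending, rho) <= present_value(income, rho)


# Hypothetical three-year plan with interest factor rho = 1.1:
# the planner borrows in year 0 and is solvent again by year 2.
income = [100.0, 100.0, 100.0]
spending = [150.0, 50.0, 90.0]
rho = 1.1
gap = present_value(income, rho) - present_value(spending, rho)
print(final_savings(income, spending, rho), gap * rho ** 2)  # the two agree, as in (3.2)
```

The two printed numbers coincide up to rounding, which is the content of (3.2): the final savings equal the gap between the two present values, scaled by $\rho^n$.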

Henceforth we shall change over completely to the (B)-world formulation. The
prices $p_i'/\rho$ are sometimes called present value prices or real prices since they give
the real value of next year's goods as compared with this year's. (The original
numbers $p_i'$ are called money prices.) We see now that a positive interest rate
corresponds to falling real prices, and this allows us to phrase the interest rate problem
in the following simple form. Why under normal conditions do real prices fall? (The
distinction between real and money prices is, of course, crucial. As we are all too
painfully aware, money prices have in recent years been rising, not falling, in most
countries, but this "inflation rate" has been substantially below the "prime" rate
of interest, so that the real interest rate has continued to be positive. Of course there
have been historical cases of "runaway inflation" in which even real prices rise,
giving a negative real interest rate.) Our objective will be to show that if an economic
system obeys the simple laws of the previous section then the real prices determined
by these laws will in general have to decrease with time. Now if the economy itself is
changing then one would expect prices to change too. The thing that is rather
surprising, however, is the fact that even when the economy does not change in any way,
the prices will. Specifically, we shall analyze economies in "steady states" in which
quantities of goods produced and consumed remain the same period after period.
When we calculate prices for this situation as determined by the two laws of the
previous section it will turn out that the prices, unlike the physical quantities, do not
remain constant but in general fall. This is the content of our first main theorem.

(Of course the economic world in which we live today is probably not even
approximately in a steady state so one might argue that the theory presented here
has no "relevance," to use the popular term. But anyone familiar with scientific
method will realize that this kind of criticism is beside the point. It would be like
criticizing Galileo's assertion that all bodies fall at the same rate in a vacuum on the
grounds that on the earth's surface bodies never fall in a vacuum. The present
exposition is not trying to explain interest in the economy of today or any other
particular era but rather to isolate the basic universal causes of the interest
phenomenon. For this purpose one studies not the real world but idealized models from
which "irrelevant" disturbances have been deliberately eliminated. The steady state
in our problem serves the same purpose as the vacuum in the problem of falling
bodies.)


4. Price theory for a simplified economy. We imagine an economy in which there
are just three goods called labor, capital, and a third good on which people subsist
called consumption. At each period of time $t$ "nature" supplies a certain amount
of labor $L_t$. For most of the analysis we shall assume that $L_t$ changes from one period
to the next by some constant growth factor $\gamma$ so that the amount of labor at time $t$ is
$l\gamma^t$, where $l$ is the fraction of population comprising the labor force.

Capital and consumption goods are not provided by nature but must be produced.
The mechanism of production can be described in the following simple way. A
productive activity $\mathcal{A}$ is a 4-tuple of real numbers, written $\mathcal{A} = (-L, -K, C, K')$,
where $L$ and $K$ are the inputs of labor and capital, and $C$ and $K'$ are the outputs of
consumption and capital. The reason for the minus signs on the inputs will become
apparent shortly. The ultimate goal of the economy is to produce consumption,
capital being important only as an intermediate good which together with labor
makes it possible to provide greater amounts of consumption. A technology $\mathcal{T}$
consists of a set of activities, thus, a subset of 4-space. We make the following
assumptions:

(i) (homogeneity) if $\mathcal{A} \in \mathcal{T}$ then $\lambda\mathcal{A} \in \mathcal{T}$ for all $\lambda \ge 0$;

(ii) (additivity) if $\mathcal{A}_1, \mathcal{A}_2 \in \mathcal{T}$ then $\mathcal{A}_1 + \mathcal{A}_2 \in \mathcal{T}$;

(iii) (closure) $\mathcal{T}$ is a closed set.

These assumptions mean, (i) an activity can be operated at any level, that is, 
multiplying inputs by some constant multiplies outputs by the same constant, and 
(ii) given two activities they can both be operated at the same time. 

If $\mathcal{A}$ and $\mathcal{A}'$ are activities we say $\mathcal{A}'$ dominates $\mathcal{A}$ if $\mathcal{A}' \ge \mathcal{A}$ (the symbol $x \ge y$
means the vector $x - y$ is nonnegative and nonzero). Thus, one activity dominates
another if from the same or smaller inputs it can produce the same or larger outputs.
An activity in $\mathcal{T}$ is called efficient if there is no other activity in $\mathcal{T}$ which dominates it.
If the objective of the economy is to maximize consumption output and minimize 
labor input it is clearly never desirable to use an inefficient activity. A concrete 
example may help to illustrate these ideas. 
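The dominance relation is easy to state in code. The Python sketch below is an illustration only; the function names are hypothetical, and because a technology is an infinite set, `efficient_within` checks efficiency only relative to a finite list of candidate activities supplied by the caller.

```python
def dominates(a_prime, a):
    """True if a_prime - a is nonnegative in every coordinate and not the zero vector."""
    diffs = [x - y for x, y in zip(a_prime, a)]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def efficient_within(activity, candidates):
    """True if no activity in the finite list `candidates` dominates `activity`."""
    return not any(dominates(other, activity) for other in candidates)


# Activities are written (-L, -K, C, K'), so a smaller input is a larger coordinate.
wasteful = (-2.0, -1.0, 1.0, 1.0)   # two units of labor
frugal = (-1.0, -1.0, 1.0, 1.0)     # same outputs from one unit of labor
print(dominates(frugal, wasteful))  # → True
```

The minus signs on the inputs are what make a single componentwise comparison cover both "smaller inputs" and "larger outputs".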


Example: We consider a technology in which there are two kinds of activities,
one for producing capital and the other for consumption. We suppose for simplicity
that capital is produced by labor alone and, by suitable choice of units, we may
suppose that one unit of labor is able to produce one unit of capital. This means that
the capital producing activities of $\mathcal{T}$ are of the form $(-L_K, 0, 0, L_K)$, where $L_K$ is
the amount of labor allocated to producing capital. Consumption is produced by
capital and labor together in such a way that one unit of labor together with $K$ units
of capital produce $K^\alpha$ units of consumption where $0 < \alpha < 1$. In addition, capital
used as input to production emerges from the productive process slightly depreciated
so that there are only $\mu K$ units of capital where $0 < \mu < 1$. Thus a typical consumption
producing activity is some multiple of an activity $\mathcal{A}_K = (-1, -K, K^\alpha, \mu K)$. Suppose
now that at some instant of time there are $L$ units of labor and $K$ units of capital
available and it has been decided to allocate $L_C$ units of labor to producing
consumption. Then in order to utilize all available capital one should operate activity
$\mathcal{A}_{K/L_C}$ at level $L_C$ giving the activity


(4.1)
$$\mathcal{A}_1 = (-L_C, -K, L_C^{1-\alpha}K^\alpha, \mu K).$$


Assuming full employment the rest of the labor force is used to produce new capital
using the activity


(4.2)
$$\mathcal{A}_2 = (-(L - L_C), 0, 0, L - L_C)$$

and combining gives

(4.3)
$$\mathcal{A} = (-L, -K, L_C^{1-\alpha}K^\alpha, L - L_C + \mu K).$$

Since $K' = L - L_C + \mu K$ and $C = L_C^{1-\alpha}K^\alpha$, eliminating $L_C$ gives

(4.4)
$$C = (L + \mu K - K')^{1-\alpha}K^\alpha$$


which holds for all $(L, K, K')$ where $\mu K \le K' \le L + \mu K$. This describes the set of all
efficient activities of the model.
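The passage from (4.3) to (4.4) can be checked numerically. The Python sketch below is an illustration only; the function names and the sample values of $\alpha$, $\mu$, $L$, $K$ and $L_C$ are assumptions chosen for the example.

```python
def combined_activity(L, K, L_C, alpha, mu):
    """(4.3): the consumption activity (4.1) plus the capital activity (4.2)."""
    consumption = L_C ** (1 - alpha) * K ** alpha
    return (-L, -K, consumption, L - L_C + mu * K)


def consumption_frontier(L, K, K_next, alpha, mu):
    """(4.4): consumption C as a function of (L, K, K') on the efficient frontier."""
    return (L + mu * K - K_next) ** (1 - alpha) * K ** alpha


# Hypothetical figures: 10 units of labor, 4 of capital, 6 units of labor on consumption.
activity = combined_activity(10.0, 4.0, 6.0, alpha=0.5, mu=0.9)
c_from_frontier = consumption_frontier(10.0, 4.0, activity[3], alpha=0.5, mu=0.9)
print(activity[2], c_from_frontier)  # the two values agree up to rounding
```

Evaluating `consumption_frontier` at nearby points also shows the monotonicity one expects of (4.4): more labor or capital raises $C$, and a larger target $K'$ lowers it.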

It is clear from (4.4) that $C$ is an increasing function of $K$ and $L$ and a decreasing
function of $K'$ as it should be. This leads to a second basic notion. The activity
$\mathcal{A}$ is called unsaturated if, roughly speaking, increasing any input enables one to
produce more of all outputs. The precise definition is the following: let $\mathcal{A}_i$ be the $i$th
coordinate of $\mathcal{A}$. Then for each $i = 1, 2, 3, 4$ there exists $\mathcal{A}'$ in $\mathcal{T}$ such that
$\mathcal{A}'_i < \mathcal{A}_i$ and $\mathcal{A}'_j > \mathcal{A}_j$ for $j \ne i$. In the example it is clear that any efficient activity with positive
coordinates is unsaturated.

We now introduce prices into the model and denote the price of a unit of labor by $w$
(wage), of input capital by $p$, of output capital by $p'$ and of consumption by $q$. A
price vector $\pi$ is then a 4-tuple $\pi = (w, p, q, p')$. It is crucial here that the price of
input and output capital are not necessarily the same even though they are prices
of the same physical good. The whole interest phenomenon, as we have seen, is
concerned with prices which change with time and the point is that output capital
becomes available only after input capital is applied. (It would clearly make no
sense to have production take place instantaneously for then output capital could be
fed back as input as soon as it was produced allowing infinitely large outputs in a
single period.)

The profit of activity $\mathcal{A}$ at prices $\pi$ is simply the scalar product


(4.5)
$$\pi\mathcal{A} = -wL - pK + qC + p'K'.$$


The interpretation should be clear. The term wL+ pK is the cost of inputs and 
qC + p’K’ is the value of outputs and their difference is profit in its ordinary meaning. 
We now have a simple but basic result. 




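THEOREM 1. If $\mathcal{A}$ is efficient then there is a price vector $\pi \ge 0$ such that $\mathcal{A}$
maximizes profits at prices $\pi$ among all activities in $\mathcal{T}$. If in addition $\mathcal{A}$ is
unsaturated then $\pi$ is positive.


■ Let $S_\mathcal{A}$ be the set of all vectors $x$ in 4-space such that $x \ge \mathcal{A}$. The definition of
efficiency says precisely that $S_\mathcal{A}$ does not intersect $\mathcal{T}$. Further $S_\mathcal{A}$ is a convex set
and so is $\mathcal{T}$ from Assumptions (i) and (ii). It follows from the "Fundamental Theorem
of Convexity" that there is a hyperplane $H$ separating $\mathcal{T}$ and $S_\mathcal{A}$. Letting $\pi$ be the
normal to $H$ in the direction of $S_\mathcal{A}$ we have for any $\mathcal{A}'$ in $\mathcal{T}$,


(4.6)
$$\pi\mathcal{A}' \le \pi(\mathcal{A} + z) \quad\text{for any } z \ge 0$$


which implies $\pi\mathcal{A}' \le \pi\mathcal{A}$ and this is precisely the condition that $\mathcal{A}$ is profit maximizing
at the price $\pi$. Also $\pi \ge 0$, for let $e_i$ be the $i$th unit vector. Taking $z$ in (4.6) to be $\lambda e_i$
we have


(4.7)
$$\pi\mathcal{A}' \le \pi\mathcal{A} + \lambda\pi_i,$$


for all $\lambda > 0$ so $\pi_i$ cannot be negative.
To prove the last part of the Theorem, let $\mathcal{A}'$ be the activity with $\mathcal{A}'_i < \mathcal{A}_i$ and
$\mathcal{A}'_j > \mathcal{A}_j$ for $j \ne i$. Then


$$\pi\mathcal{A}' = \sum_{j=1}^{4} \pi_j\mathcal{A}'_j \le \sum_{j=1}^{4} \pi_j\mathcal{A}_j = \pi\mathcal{A},$$

so

(4.8)
$$\pi_i(\mathcal{A}_i - \mathcal{A}'_i) \ge \sum_{j \ne i} \pi_j(\mathcal{A}'_j - \mathcal{A}_j),$$


but since $\pi \ne 0$ the right-hand side of (4.8) is positive, hence so is $\pi_i$. ■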

The reader should be aware of the economic significance of this rather simple 
theorem. If we accept the second law of motion that any activity used in the operation 
of our model should be profit maximizing then this fact alone determines what 
wages to pay to labor and what the return on capital shall be, at least for the case 
when the price vector z is unique (up to a multiplicative constant). As an illustration, 
let us consider the classical issue of a man who builds a machine and then hires labor 
to operate it in order to produce consumption. The question is then how much of the 
proceeds from production should go to the laborers and how much to the machine 
owner. Theorem 1 gives a simple answer. The workers get wL and the machine owner 
pK. Once again we emphasize that this result is “forced” on any economic system,
capitalist or socialist, satisfying our second law of motion. Of course there are basic 
economic differences in the two systems when it comes to determining what happens 
to the quantity pK. In a pure capitalist economy this amount would go to increase 
the wealth of individual producers (or stock holders), while in the socialist economy 


* Observe the notational breakthrough. The initial symbol “∎” does for the word “proof”
the same job that the final “∎” has been doing for “Q. E. D.” all these years.


862 DAVID GALE [October 


it would go to the state, since it is the machine owner, and the two types of economies 
will in many ways behave very differently—but not regarding the laws of price for- 
mation.* 

Let us compute the prices in our special example. Given the efficient activity $\bar{\mathcal{A}}$
we must find prices $\pi$ such that $\bar{\mathcal{A}}$ maximizes (4.5) subject to the equation (4.4).
Substituting from (4.4) into (4.5) we have


(4.9) $\pi\mathcal{A} = -wL - pK + (\mu K + L - K')^{1-\alpha}K^{\alpha} + p'K'$,


where we normalize prices by taking q = 1. Setting partial derivatives of (4.9) with
respect to L, K and K’ equal to zero gives


(4.10)
$w = (1-\alpha)(K/L_c)^{\alpha}$,
$p = [\mu(1-\alpha) + \alpha(L_c/K)](K/L_c)^{\alpha}$,
$p' = (1-\alpha)(K/L_c)^{\alpha}$,


where we recall that $L_c = \mu K + L - K'$ is the amount of labor allocated to producing
consumption. Note that the wage is equal to the price of output capital as it should
be from our assumption that capital is produced by labor alone. The main quantity
with which this paper is concerned is the interest factor $\rho = p/p'$ (see previous
section). In the example, $\rho$ is given by the expression


(4.11) $\rho = \mu + (\alpha/(1-\alpha))(L_c/K)$.


We see that it is possible that the number $\rho$ could be smaller than 1, for one can
choose $L_c/K$ as small as one wishes. This would correspond to a negative interest rate
or equivalently a rising real price of capital. The following sections will show why
this situation, though possible, is unlikely to occur.
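
The price formulas (4.10) and the interest factor (4.11) can be checked numerically. What follows is only a minimal sketch: the parameter values ($\alpha = 0.25$, $\mu = 0.9$) and the activity $(L, K, K') = (1, 1, 1)$ are illustrative assumptions, not taken from the text, and the function names are hypothetical.

```python
def labor_for_consumption(L, K, K_out, mu):
    """L_c = mu*K + L - K', the labor left over for producing consumption."""
    return mu * K + L - K_out


def prices(L, K, K_out, alpha, mu):
    """Prices (w, p, p') from (4.10), with q normalized to 1."""
    L_c = labor_for_consumption(L, K, K_out, mu)
    ratio = (K / L_c) ** alpha
    w = (1 - alpha) * ratio
    p = (mu * (1 - alpha) + alpha * (L_c / K)) * ratio
    p_out = (1 - alpha) * ratio
    return w, p, p_out


def profit(L, K, K_out, w, p, p_out, alpha, mu):
    """Profit (4.9) of the activity with inputs L, K and capital output K'."""
    L_c = labor_for_consumption(L, K, K_out, mu)
    C = L_c ** (1 - alpha) * K ** alpha
    return -w * L - p * K + C + p_out * K_out


alpha, mu = 0.25, 0.9            # illustrative values only
L, K, K_out = 1.0, 1.0, 1.0      # illustrative activity
w, p, p_out = prices(L, K, K_out, alpha, mu)

rho = p / p_out                  # interest factor computed from the prices
L_c = labor_for_consumption(L, K, K_out, mu)
rho_formula = mu + alpha / (1 - alpha) * (L_c / K)   # formula (4.11)
print(w, p, p_out, rho, rho_formula)
```

With these prices the chosen activity earns zero profit and nearby activities earn no more, which is what profit maximization under a homogeneous technology leads one to expect.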

Observe that the calculus technique used here will work whenever the set of
efficient activities is smooth, that is, whenever there is a differentiable function $\varphi$
such that the efficient activities are exactly those satisfying


(4.12) $\varphi(L, K, C, K') = 0$,


for note that prices must be such that $-wL - pK + qC + p'K'$ is a maximum
subject to (4.12). Using a Lagrange multiplier $\lambda$ in the standard way we have


(4.13) $w = -\lambda\varphi_L$, $p = -\lambda\varphi_K$, $q = \lambda\varphi_C$, and $p' = \lambda\varphi_{K'}$.


(Actually we can choose $\lambda = 1$ since our technology is homogeneous so that if $\pi$ is a
price vector so is any positive multiple of it.) We shall return to this formulation in
Section 6.


* A striking illustration of the “invariance” of interest with respect to the social system is the
fact that savings accounts in mainland China pay a whopping 7% and there is no inflation, so this
is the real interest rate, which is much higher than that of any capitalist country!




5. Steady states and the qualitative theory of interest. We now suppose that population
in our model is growing by some fixed factor $\gamma$ and that labor supplied in period
t is $l\gamma^t$. The model is said to be in a steady state if it uses only the single activity $\mathcal{A}$
in all periods, operating it at level $\gamma^t$ in period t. We have not yet used the first law of
motion and will do so now. The condition that supply equals demand in this model
means that (A) the entire labor force is employed in all periods and (B) the amount
of capital produced as output in period t is precisely the amount demanded as input
in period t + 1. For a steady state $K_{t+1} = \gamma K_t$ and hence from (B) $K'_t = \gamma K_t$. Thus
if we denote by $\mathcal{A} = (-l, -k, c, \gamma k)$ the activity used in period t = 0 then $\gamma^t\mathcal{A}$ is
the activity used in period t. Note that the lower case letters l, k and c represent per
capita quantities of labor, capital and consumption, and they remain constant in
time.

Now for each value k of per capita capital there will exist at most one efficient
steady state corresponding to the activity with inputs l, k and output $\gamma k$, having the
largest possible value of c. We denote this value by $c_s(k)$. The function $c_s(k)$ is central
to our analysis and it will be instructive to calculate it for the example of the previous
section. For a steady state, equation (4.4) becomes


(5.1) $c_s(k) = (l - (\gamma - \mu)k)^{1-\alpha}k^{\alpha} = l_c^{1-\alpha}k^{\alpha}$,


where $l_c = l - (\gamma - \mu)k$. Notice that there are steady states for all k up to $k_{\max}
= l/(\gamma - \mu)$ but not greater, for a per capita capital stock above this value could not
be maintained even if all labor was allocated to producing new capital. Differentiating
(5.1) gives


(5.2) $c_s'(k) = (\alpha l/k - (\gamma - \mu))(k/l_c)^{\alpha}$,


and we see that $c_s(k)$ increases from 0 to a maximum as k goes from 0 to $\bar k = \alpha l/(\gamma - \mu)$
and then decreases back to 0 as k runs from $\bar k$ to $k_{\max}$. We are assuming of course
that $\gamma \ge 1$ and $\mu \le 1$. (For the special case $\gamma = \mu = 1$, no population growth or
depreciation, $c_s$ would be defined and increasing for all k.) Figure 1 gives the graph
of c,(k) for y = 1.03, p= .9, a=4, J=1. 


FIG. 1. Graph of $c_s(k)$: suboptimal region, $\bar k = 1.25$, superoptimal region, $k_{\max}$.




The steady state with $k = \bar k$ is called an optimal steady state. It represents the
economic “millennium,” the “golden age” in which the standard of living has
reached the highest possible sustainable value. In reality it seems unlikely that any
society has ever achieved such a state and it is not even clear that it would be desirable 
to do so. As an example, imagine a society which has achieved a very high standard 
of living. Then a social planner points out that an even higher standard would be 
possible if all the present dwellings were torn down and replaced by even better ones, 
a process that might take several decades. Under these conditions the society might 
well decide that the slight improvement in future comfort was not worth the in- 
convenience of living in tents for twenty years. 

From Figure 1 we also see that there are steady states with $k > \bar k$, but these are,
from an economic point of view, highly unreasonable. They correspond to a society
which has gone “beyond the millennium” and built up such a large stock of capital
that it is wasting labor on maintaining the stock of capital instead of using it to
produce consumption. In the extreme case, $k = k_{\max}$, all labor is allocated to maintaining
the capital stock while the population starves to death!

The ideas illustrated above can be defined in general. 


DEFINITION: A steady state $\mathcal{A}$ is called optimal if it maximizes $c_s$. It is called
sub (super) optimal if there is another steady state $\bar{\mathcal{A}}$ such that $\bar c_s > c_s$ and $\bar k > k$
($\bar k < k$).

As we have noted in connection with our example, optimal and especially superoptimal
steady states, though conceivable, are economically unrealistic. The normal
state of an economy is in the suboptimal region.

We can now give our fundamental qualitative result.


THEOREM 2. If $\mathcal{A}$ is an optimal steady state it has an interest factor equal to $\gamma$.
If $\mathcal{A}$ is sub (super) optimal then its interest factor will be strictly greater (less)
than $\gamma$.


Before proving this let us check it for our particular example. In a steady state
we have seen that $L_c/K = l/k - (\gamma - \mu)$. Substituting this in (4.11) gives for the
steady state interest factor


$\rho = \mu + (\alpha/(1-\alpha))(l/k - (\gamma - \mu))$
and
(5.3) $\rho - \gamma = (\alpha l/k - (\gamma - \mu))/(1 - \alpha)$


which is positive, zero, or negative according as k is less than, equal to, or greater
than its optimal value $\bar k = \alpha l/(\gamma - \mu)$.
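
The check above can also be carried out numerically. The following is a sketch under stated assumptions: the values $\alpha = 0.25$, $\gamma = 1.1$, $\mu = 0.9$, $l = 1$ are chosen only for illustration (they are not claimed to be the parameters of Figure 1), and the function names are hypothetical. It locates the maximizer of (5.1) by a crude grid search and compares it with $\alpha l/(\gamma - \mu)$, then evaluates the sign of (5.3) on either side.

```python
def c_s(k, alpha, gamma, mu, l=1.0):
    """Steady state per capita consumption, formula (5.1)."""
    l_c = l - (gamma - mu) * k
    return l_c ** (1 - alpha) * k ** alpha


def rho_minus_gamma(k, alpha, gamma, mu, l=1.0):
    """Right-hand side of (5.3)."""
    return (alpha * l / k - (gamma - mu)) / (1 - alpha)


alpha, gamma, mu = 0.25, 1.1, 0.9        # illustrative values only
k_max = 1.0 / (gamma - mu)               # largest sustainable capital stock
k_bar = alpha * 1.0 / (gamma - mu)       # claimed optimum

# Crude grid search for the maximizer of c_s on the open interval (0, k_max).
grid = [k_max * i / 10000 for i in range(1, 10000)]
k_best = max(grid, key=lambda k: c_s(k, alpha, gamma, mu))

print(k_best, k_bar)
print(rho_minus_gamma(1.0, alpha, gamma, mu),   # below the optimum
      rho_minus_gamma(2.0, alpha, gamma, mu))   # above the optimum
```

A grid search is used rather than a root finder so that the sketch does not rely on (5.2) itself; the agreement of the two values is then an independent check.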

∎ Let $\mathcal{A} = (-l, -k, c, \gamma k)$ be an efficient steady state. From Theorem 1 there
exist prices $\pi = (w, p, q, p')$ such that


(5.4) $-wl - pk + qc + p'\gamma k \ge -wl - p\bar k + q\bar c + p'\gamma\bar k$,


where $\bar{\mathcal{A}} = (-l, -\bar k, \bar c, \gamma\bar k)$ is any other steady state. Recalling that $p/p' = \rho$ we get
from (5.4)


(5.5) $q(\bar c - c) \le p'(\rho - \gamma)(\bar k - k)$.


If $\mathcal{A}$ is suboptimal then by definition we can choose $\bar{\mathcal{A}}$ so that $\bar c > c$ and $\bar k > k$ which
implies $\rho > \gamma$. If $\mathcal{A}$ is superoptimal then we can choose $\bar{\mathcal{A}}$ so that $\bar c > c$ and $\bar k < k$
which implies $\rho < \gamma$.

Finally, suppose $\bar{\mathcal{A}} = (-l, -\bar k, \bar c, \gamma\bar k)$ is optimal. For every $\mathcal{A} = (-l, -k, c, k')$
in $\mathcal{T}$ consider the 3-vector $\beta = (-l, k' - \gamma k, c)$ and let $\mathcal{B}$ be the set of all such $\beta$.
Note that $\mathcal{B}$ is convex and $\bar\beta = (-l, 0, \bar c)$ is in $\mathcal{B}$ and by definition of optimality there
is no $\beta$ in $\mathcal{B}$ with $\beta \ge \bar\beta$. It follows by the Separating Hyperplane Theorem as in
Theorem 1 that there is a vector $\pi = (w, p', q)$ such that $\pi\bar\beta \ge \pi\beta$ for all $\beta$ in $\mathcal{B}$. But
this means


(5.6) $-wl - p'\gamma\bar k + q\bar c + p'\gamma\bar k \ge -wl - p'\gamma k + qc + p'k'$


for all $(-l, -k, c, k')$ in $\mathcal{T}$, so letting $p = \gamma p'$ we see that $\bar{\mathcal{A}}$ has interest factor
$\gamma$. ∎

The theorem explains at least for steady states why we should expect interest 
rates to exceed growth rates, since, as we have seen, an economy will normally be 
in the suboptimal region. For an economy which is not in a steady state we do not 
even have in our model a sensible notion of interest rate since the prices of labor, 
capital and consumption may all be changing from period to period at different 
rates. Nevertheless, the general idea that (real) prices fall with time (though at 
different rates for different goods) still makes sense, and Theorem 2 gives a qualitative 
explanation of why this should occur for an economy in the suboptimal region where 
a higher future rate of consumption can be achieved only by first increasing the stock 
of capital. 


6. What does the interest rate measure? Having learned why interest rates are 
positive it is natural to go further and ask what determines their magnitude. What 
distinguishes an economy where the interest rate is high from one where it is low? 
The qualitative analysis of the preceding section suggests various possibilities. We 
have seen for example that $\rho - \gamma$ is positive for $k < \bar k$ and negative for $k > \bar k$. It is
natural to conjecture from this that $\rho$ is a decreasing function of k and the magnitude
of $\rho - \gamma$ measures in some way how close the economy is to its golden age. Notice
that this is exactly the situation in our example as illustrated by equation (5.3). It has 
come as a rather recent shock in economic theory, however, that this plausible 
conjecture is not true. Even for our simple model one can concoct technologies in 
which the interest rate does not fall monotonically as a function of the level of steady 
state capital; so we must look elsewhere for the answer to our question. 

Now there is one further qualitative property of steady states which can be
proved. Namely the shape of the graph of $c_s(k)$ of Figure 1 turns out to be typical.


THEOREM 3. The function $c_s(k)$ is strictly increasing (decreasing) if k is sub
(super) optimal.


∎ This follows at once from (5.5), for from Theorem 2, if k is suboptimal then
$\rho - \gamma > 0$, so if $\bar k < k$ then $\bar c < c$, and if k is superoptimal then $\rho - \gamma < 0$, so if
$\bar k > k$ then $\bar c < c$. ∎

Now suppose at some time t we have on hand a suboptimal per capita
capital stock k. From Theorem 3 it follows that by increasing capital to $k + \Delta k$ we
can increase $c_s$ by some positive amount $\Delta c_s$. The limit of $\Delta c_s/\Delta k$, namely $dc_s/dk$, if it exists,
is the slope of the graph of $c_s(k)$ and measures the rate of increase of steady state
consumption per unit increase in capital. Of course $dc_s/dk$ cannot be the interest
factor for it is not a pure number but depends on the units used to measure c and k.
But now let us carry the analysis one step further. In order to obtain an increase in
input capital $k_{t+1}$ next period it is necessary to increase output capital $k'_t$ this period.
Since $\gamma^t k'_t = \gamma^{t+1}k_{t+1}$, we have $k'_t = \gamma k_{t+1}$, so an increase of $\Delta k_{t+1}$ next period
requires an increase of $\Delta k'_t = \gamma\,\Delta k_{t+1}$ this period. Finally, an increase of $\Delta k'_t$ this period
requires that people sacrifice some consumption $\Delta c_t$, that is, they must accept $\Delta c_t$
less than the steady state consumption $c_s(k_t)$. The key question is then: how much
of an increase in steady state consumption $c_s$ in the future can be obtained per unit
sacrifice in consumption $c_t$ today? That is, we wish to measure $\Delta c_s/\Delta c_t$ or rather its
limiting value $-dc_s/dc_t$. (The minus sign occurs here because $\Delta c_t$ was defined as a
sacrifice of consumption.) We shall state the theorem relating this quantity to the
interest rate for the special case in which the technology is smooth in the sense
described in the previous section (a somewhat more complicated result holds in the
nonsmooth case because of the possibility that the interest factor is not unique).


THEOREM 4. For a smooth technology the quantity $(-dc_s/dc_t)$ exists and is
equal to $\bar r = (\rho/\gamma - 1)$.


For the special case when $\gamma = 1$ we have $\bar r = r$, the ordinary interest rate. The
content of the theorem can be described in economic terms in the following way.
Given an economy in a steady state, it is possible instead of continually consuming
all of $c_s$ to sacrifice some amount $\Delta c$. This “investment” of $\Delta c$ units will provide
(in the limit) a “dividend” of $\bar r\,\Delta c$ units of additional per capita consumption forever
after. The theorem also makes sense in the superoptimal case in which $\bar r$ is negative.
There it says that we can “have our cake and eat it too.” We can get an additional
$\Delta c$ units of consumption today and still have (in the limit) $-\bar r\,\Delta c$ extra units ever
after. This odd situation is due to the fact once again that in the superoptimal case
we have been wasting resources maintaining too much capital and we can be better
off by “eating up” some of this burdensome excess.

∎ The proof is an exercise in implicit differentiation. We rewrite equation (4.12) as


(6.1) $\varphi(l, k_t, c_t, k'_t) = 0$.


Since $\varphi_{k'} = p' > 0$ we have


(6.2) $dk'_t/dc_t = -\varphi_c/\varphi_{k'} = -q/p'$ from (4.13).


Next, as we have already observed, $k'_t = \gamma k_{t+1}$ so $dk_{t+1}/dk'_t = 1/\gamma$. Finally we must
compute $dc_s/dk$. For this purpose (6.1) becomes


(6.3) $\varphi(l, k, c_s, \gamma k) = 0$

and differentiating with respect to k gives


(6.4) $\varphi_k + \gamma\varphi_{k'} + \varphi_c\,dc_s/dk = 0$ or, from (4.13),
$dc_s/dk = (p - \gamma p')/q = p'(\rho - \gamma)/q$,


so
$-dc_s/dc_t = -(dc_s/dk_{t+1})(dk_{t+1}/dk'_t)(dk'_t/dc_t)$
$= (p'(\rho - \gamma)/q)(1/\gamma)(q/p') = (\rho/\gamma - 1)$.
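
Theorem 4 can be tested numerically on the example technology of Section 4. The sketch below is an illustration only: it assumes the per capita form $c = (l + \mu k - k')^{1-\alpha}k^{\alpha}$ read off from (4.9), uses illustrative parameter values that are not from the text, and replaces the derivatives in the proof by central differences; all function names are hypothetical.

```python
def consumption(l, k, k_out, alpha, mu):
    """Consumption produced from labor l and capital k when output capital is k_out."""
    return (l + mu * k - k_out) ** (1 - alpha) * k ** alpha


def central_difference(f, x, h=1e-6):
    """Numerical derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


alpha, gamma, mu, l = 0.25, 1.1, 0.9, 1.0   # illustrative values only
k = 1.0                                      # suboptimal: below alpha*l/(gamma - mu) = 1.25


def steady_state_consumption(kk):
    """c_s(k): in a steady state the output capital is gamma*k."""
    return consumption(l, kk, gamma * kk, alpha, mu)


def consumption_today(k_out):
    """c_t as a function of output capital k'_t, with input capital held at k."""
    return consumption(l, k, k_out, alpha, mu)


dcs_dk = central_difference(steady_state_consumption, k)
dct_dkout = central_difference(consumption_today, gamma * k)

# Chain rule from the proof: -dc_s/dc_t = -(dc_s/dk)(dk_{t+1}/dk'_t)(dk'_t/dc_t)
minus_dcs_dct = -dcs_dk * (1 / gamma) * (1 / dct_dkout)

rho = mu + alpha / (1 - alpha) * (l / k - (gamma - mu))   # steady state interest factor
r_bar = rho / gamma - 1
print(minus_dcs_dct, r_bar)
```

The two printed numbers should agree to within the accuracy of the finite differences.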


7. Concluding remarks. The whole of the preceding analysis has been carried out 
for a fictitious three-good world. It is natural to ask what parts of the theory carry 
over to a world such as the one we live in, in which there are thousands of goods. 
First it is important to point out those things that do not carry over. 

Our fictitious model contained a fictitious good which we called “capital,” and
much of the theory depended on the amount of this good, the size of the “capital
stock.” In particular a crucial point of the analysis was whether the capital stock was
above or below its “optimal” value $\bar k$. In a multi-good economy one can still talk of
the stock of capital, this being all the factories, mines, buildings, transportation
facilities and so on in the model, but it no longer makes sense to talk of the “size” of
the stock. That is, there is no way in general to compare two different capital stocks
and decide which one is “bigger.” There have been various attempts by economists to
find a suitable measure of the stock of capital but none has really succeeded and I 
doubt if the problem has a sensible solution. 

Fortunately one does not need the concept of the amount of capital in order to 
generalize the theory of interest as presented above. The thing which does carry over 
to multi-commodity models is the concept of suboptimal, optimal, and superoptimal 
steady states. The idea is very simple. A steady state is optimal if there is no other
steady state in which all members of the economy are better off. It is suboptimal if
there exists a better steady state but it cannot be reached without making at least
some members of the economy worse off during the time required to reach it. It is
superoptimal if there is a better steady state which can be reached at no sacrifice at
all to any member of the economy. With these definitions the analogue of Theorem 2
carries over verbatim. The mathematics involved is considerably more complicated
(and interesting) than that used here. The interested reader is referred to references
[1] and [2] for original versions of these results or to [3] for a self-contained treatment
which generalizes them. A multi-commodity version of the quantitative Theorem
4 also exists but is as yet unpublished.


References 


1. E. Malinvaud, Capital accumulation and efficient allocation of resources, Econometrica, No. 2, 21 (1953) 233-268.
2. D. Starrett, The efficiency of competitive programs, Econometrica, No. 6, 38 (1970) 704-711.
3. R. Rockwell and D. Gale, Malinvaud’s lemma and the theory of interest, ORC 72-14, Operations Research Center, Univ. of California, Berkeley, (June 1972).
4. R. M. Solow, Capital Theory and the Rate of Return, North Holland Press, Amsterdam, 1963.


THE NESTING AND ROOSTING HABITS OF 
THE LADDERED PARENTHESIS 


R.K. GUY, University of Calgary, Alberta, Canada, and J. L. SELFRIDGE, Northern 
Illinois University 


We refer to

$$a^{a^{\cdot^{\cdot^{\cdot^{a}}}}}$$

where there are k a’s, as a k-level expression. It is ambiguous until the order of the
$k - 1$ operations has been indicated, say by the insertion of $k - 2$ pairs of parentheses.
The total number of ways of parenthesizing was found by Catalan [1] to be

$$c_k = \frac{1}{k}\binom{2k-2}{k-1}.$$

He used the elegant recurrence relation

$$c_k = c_1c_{k-1} + c_2c_{k-2} + \cdots + c_{k-1}c_1.$$


An interesting discussion of Catalan numbers appears in a paper [6] in this issue of 
the MONTHLY, which contains further references. 
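
As a quick check that Catalan's closed form and his recurrence agree, one might compute both for small k. This is a minimal sketch; the function names are hypothetical and the indexing follows the text, with $c_1 = 1$ counting the single 1-level expression.

```python
from math import comb


def catalan_by_recurrence(n):
    """Return [c_1, ..., c_n] from c_1 = 1 and c_k = c_1 c_{k-1} + ... + c_{k-1} c_1."""
    c = [0, 1]                      # c[0] is a placeholder so that c[k] is c_k
    for k in range(2, n + 1):
        c.append(sum(c[i] * c[k - i] for i in range(1, k)))
    return c[1:]


def catalan_closed_form(k):
    """c_k = (1/k) * binomial(2k - 2, k - 1)."""
    return comb(2 * k - 2, k - 1) // k


values = catalan_by_recurrence(8)
print(values)
print([catalan_closed_form(k) for k in range(1, 9)])
```

For k = 4 both give 5, the number of ways of parenthesizing a 4-level expression.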

We first consider those expressions in which the parentheses are nested. The
number of such k-level expressions is $2^{k-2}$, as was pointed out in Problem E 1903
of this MONTHLY [2]. This problem puts a = 2 and asks for the number of distinct
values of the expressions for a given k. In this case the position of the innermost pair
of parentheses is arbitrary, since

$$(2^2)^2 = 2^{(2^2)}.$$

We complete the solution by showing that for $k \ge 3$, the $2^{k-3}$ remaining values
are all distinct. To prove this, observe that in the evaluation each successive operation is either a
squaring or an exponentiation base 2. We give the value, $v_i$, of an expression, in terms
of its second order exponent, $e_i$,

$$v_i = 2^{2^{e_i}}.$$

Since

$$\left(2^{2^{e_i}}\right)^2 = 2^{2^{e_i} \times 2} = 2^{2^{e_i+1}},$$

each operation is given by $e_{i+1} = e_i + 1$ or by $e_{i+1} = 2^{e_i}$. If there is a coincidence
of values between two different k-level expressions, suppose that level k (> 3) is the
lowest at which such a coincidence occurs. Since the $(k - 1)$-level expressions which
gave rise to the coincidence are distinct, the equal k-level expressions have their last
operations distinct; one an addition, the other an exponentiation. Thus $e + 1 = 2^f$
where e, f are the second order exponents at level $k - 1$. We may write $e + 1 = 2^g + h$,
where $1 \le h \le 2^g$, so that the last h operations were additions. At level 3 the second
order exponent is 2, so $h \le k - 3$. Also $f \ge k - 2$, because the second order exponent
increases by at least 1 for each level from 3 to $k - 1$. So

$$k - 3 \ge h = 2^f - 2^g \ge 2^{f-1} \ge 2^{k-3} > k - 3$$

and we have a contradiction. Hence all values above level 3 are distinct. The same
method shows that for a > 2 the $2^{k-2}$ expressions all have distinct values.
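
The distinctness claim for a = 2 can be confirmed by brute force for small k. The sketch below assumes only the two rules just derived ($e \to e + 1$ for a squaring, $e \to 2^e$ for an exponentiation base 2), starting from the second order exponent 2 at level 3; the function name is hypothetical. It stops at k = 7, since at level 8 one of the exponents would be $2^{2^{65536}}$, which is far too large to store.

```python
from itertools import product


def nested_second_order_exponents(k):
    """Second order exponents of the nested k-level expressions with a = 2 (k >= 3)."""
    values = []
    for ops in product("+^", repeat=k - 3):
        e = 2                                   # second order exponent at level 3
        for op in ops:
            e = e + 1 if op == "+" else 2 ** e  # squaring, or exponentiation base 2
        values.append(e)
    return values


for k in range(3, 8):
    values = nested_second_order_exponents(k)
    print(k, len(values), len(set(values)))
```

For each k the two counts printed coincide, so the $2^{k-3}$ sequences of operations give $2^{k-3}$ distinct values.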

We next ask how many k-level expressions there are if the $k - 2$ pairs of parentheses
are not necessarily nested. For k = 4 this number is strictly less than $c_4$ since

$$(a^a)^{(a^a)} \quad\text{and}\quad \left(a^{(a^a)}\right)^a$$

are equal, both having second order exponent $a + 1$. This shows that exponentiation
is not completely non-associative. A further problem is to count the distinct values
of the k-level expressions for a particular value of a. The answers will be the same
for all a such that there is no coincidence of value. We shall see that if there is a
coincidence of value between a $k_1$-level expression and a $k_2$-level expression, then a
coincidence occurs at all levels from $k_1 + k_2$ upwards. We assume a chosen so that
no such coincidence occurs. Such a choice is possible since only a countable number
are excluded. An outline of a proof of this is given by Göbel and Nederpelt [3].
We again work with second order exponents; now

$$\left(a^{a^{e_i}}\right)^{\left(a^{a^{e_j}}\right)} = a^{a^{e_i} \cdot a^{a^{e_j}}} = a^{a^{e_i + a^{e_j}}},$$

so the second order exponents for level k are found by elementwise addition, for
each i, of the pair of sets $\{e_i\}$, $\{a^{e_j}\}$ where i takes the values $1(1)k - 1$, $i + j = k$
and $e_i$ is a typical second order exponent for level i.

FIG. 1. The rooted trees with k vertices, k = 1(1)6 (diagrams not reproduced).


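The sets $\{e_k\}$ for k = 1(1)6 are:


k | $\{e_k\}$
1 | $0$
2 | $1$
3 | $a$; $2$
4 | $a^a$, $a^2$; $a+1$; $3$
5 | $a^{a^a}$, $a^{a^2}$, $a^{a+1}$, $a^3$; $a^a+1$, $a^2+1$; $2a$; $a+2$; $4$
6 | $a^{a^{a^a}}$, $a^{a^{a^2}}$, $a^{a^{a+1}}$, $a^{a^3}$, $a^{a^a+1}$, $a^{a^2+1}$, $a^{2a}$, $a^{a+2}$, $a^4$; $a^{a^a}+1$, $a^{a^2}+1$, $a^{a+1}+1$, $a^3+1$; $a^a+a$, $a^2+a$; $a^a+2$, $a^2+2$; $2a+1$; $a+3$; $5$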
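
The rule $e_i + a^{e_j}$ can be applied mechanically to formal expressions. In the sketch below, which is an illustration rather than the authors' procedure, a second order exponent is stored as the sorted tuple of the exponents of its summands, so that `()` stands for 0, `((),)` for $a^0 = 1$, `(((),),)` for $a^1 = a$ and `((), ())` for $1 + 1 = 2$; this encoding and the function name are assumptions made for the example. Two expressions are treated as equal only when they are formally identical, which corresponds to choosing a so that no coincidence of value occurs.

```python
def second_order_exponents(max_level):
    """Return {k: set of formal second order exponents of the k-level expressions}."""
    levels = {1: {()}}                          # level 1: the exponent 0
    for k in range(2, max_level + 1):
        found = set()
        for i in range(1, k):
            j = k - i
            for e_i in levels[i]:
                for e_j in levels[j]:
                    # e_i + a^{e_j}: append one more summand whose exponent is e_j
                    found.add(tuple(sorted(e_i + (e_j,))))
        levels[k] = found
    return levels


levels = second_order_exponents(8)
print([len(levels[k]) for k in range(1, 9)])
```

The cardinalities for k = 1(1)6 agree with the sizes of the sets tabulated above.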


On comparing the sequence of cardinalities of these sets with a prepublication
version of N. J. A. Sloane’s handy table [7], we learned what we should have guessed,
that anything which nests is often associated with trees. In fact $|\{e_k\}| = r_k$, the
number of non-isomorphic rooted, but otherwise unlabelled trees with k vertices.
Knowing this, it is not difficult to see the correspondence between such trees and the
sets as they are generated above. Exponentiation base a corresponds to growth,
planting or grafting; addition corresponds to branching. Figure 1 shows all rooted
trees with k vertices, k = 1(1)6. The parentheses are all nested except where the
second order exponents are $2a$ at level 5 and $a^{2a}$, $a^a + a$, $a^2 + a$ and $2a + 1$ at level 6.
Methods of enumerating rooted trees are well known [4, 5]. The numbers may be
calculated from the recurrence formula

$$r_k = \sum_{\pi(k-1)} \prod_i \binom{r_i + m_i - 1}{m_i},$$

where $r_k$ is the number of rooted trees with k vertices, the sum is taken over all
partitions $\pi(k-1)$ of $k - 1 = \sum i\,m_i$ into $m_i$ ($\ge 0$) parts of size i ($\ge 1$), and the
binomial coefficient is the number of ways that $m_i$ rooted trees, each with i vertices,
chosen from the $r_i$ possibilities with repetitions allowed, can be attached by $m_i$ edges
to a root to form a rooted tree with k vertices. The numbers for k = 1(1)12 are:


k      1  2  3  4  5   6   7    8    9   10    11    12
r_k    1  1  2  4  9  20  48  115  286  719  1842  4766
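
The recurrence translates directly into a short program. The following sketch is one possible implementation under the conventions just stated (a partition is represented as a dictionary from part size i to multiplicity $m_i$); the helper names are hypothetical.

```python
from math import comb


def partitions(n, max_part=None):
    """Yield the partitions of n as dicts {part size: multiplicity}."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for part in range(min(n, max_part), 0, -1):
        for mult in range(1, n // part + 1):
            # the remaining parts are strictly smaller than `part`
            for rest in partitions(n - part * mult, part - 1):
                yield {part: mult, **rest}


def rooted_trees(n_max):
    """Return [r_1, ..., r_{n_max}] computed from the recurrence."""
    r = {}
    for k in range(1, n_max + 1):
        total = 0
        for partition in partitions(k - 1):
            ways = 1
            for size, mult in partition.items():
                # multisets of `mult` trees chosen from the r[size] trees with `size` vertices
                ways *= comb(r[size] + mult - 1, mult)
            total += ways
        r[k] = total
    return [r[k] for k in range(1, n_max + 1)]


print(rooted_trees(12))
```

The output reproduces the twelve values in the table.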




In [5] the table is extended to k = 26. 

To find the number of distinct values of the $r_k$ expressions, when a takes a particular
numerical value, is a more complicated problem. In the trivial cases a = 1
(or $-1$), only the value 1 (or $-1$) occurs at each level. If as usual $0^0 = 1$, then
for a = 0 the values are 0, 1 for k = 1, 2 and both 0 and 1 for $k \ge 3$. We defer
consideration of a = 2, which initiated our discussion, since it exhibits a special
feature. We deal with a = 3, which will also serve as a model for larger integer
values.


For k = 1, ..., 6, the numerical values of the second order exponents, when a = 3, are

k | $\{e_k\}$
1 | 0
2 | 1
3 | 3; 2
4 | 27, 9; 4; 3
5 | $3^{27}$, $3^9$, 81, 27; 28, 10; 6; 5; 4
6 | $3^{3^{27}}$, $3^{3^9}$, $3^{81}$, $3^{27}$, $3^{28}$, $3^{10}$, 729, 243, 81; $3^{27}+1$, $3^9+1$, 82, 28; 30, 12; 29, 11; 7; 6; 5


The semi-colons in this table and in the earlier one separate the contributions
from the various partitions of $k - 1$ in the formula for $r_k$. So far the 1, 1, 2, 4, 9, 20
values are distinct at any one level, but the value 3 occurs at both levels 3 and 4;
27 and 4 occur at levels 4 and 5; and $3^{27}$, 81, 28, 6 and 5 occur at levels 5 and 6.
Note that the corresponding trees are those marked $a$ and $3$; $a^a$, $a+1$ and $a^3$, $4$;
$a^{a^a}$, $a^{a+1}$, $a^a+1$, $2a$, $a+2$ and $a^{a^3}$, $a^4$, $a^3+1$, $a+3$, $5$. They each arise from
replacing the (sub)tree a with 3 vertices by the (sub)tree 3 with 4 vertices. Coincidences
in value at the same level will occur whenever we have a tree containing tree a and
tree 3 as disjoint subtrees, which yields a different tree when these two subtrees are
interchanged. More generally, for any integer $a \ge 3$, the first coincidence in value,
and the unique one at that level, occurs at level $a + 4$, the trees being those in Figure 2.


FIG. 2    FIG. 3 (tree diagrams not reproduced)


They are obtained by grafting trees $a$ and $1 + 1 + \cdots + 1$ (= a), in either order,
onto the two vertices of tree 1. To find all the coincidences at level $a + 5$ (i.e. level
8 if a = 3), we graft trees $a$ and $1 + 1 + \cdots + 1$ in every possible way onto two
inequivalent vertices of each rooted tree with 3 vertices. Figure 3 exhibits the 4 ways
with pairs of vertices labelled A, A. At level $a + 6$ there are 16 coincidences, illustrated
in Figure 4 and marked with the values for a = 3. More generally, the number of
coincidences at level k would be the number of rooted trees with $k - a - 2$ vertices,
with 2 inequivalent ones having indistinguishable labels. However, there are two
further complications. The first is exhibited at level 10 for a = 3. If we start from
the a-tree (Figure 5) and graft on a, a, 3 at its 3 vertices, we obtain three trees, each
of value $3^{30} + 3$: there are 2 duplicates, where we would be counting 3. We must
use the inclusion-exclusion principle and make allowance for the number of rooted
trees with indistinguishable labels on 3 inequivalent vertices. For level 11 and a = 3
this amounts to 10 cases (Figure 6), the tenth arising from grafting a, 3, 3 onto the
a-tree. The second complication is that new coincidences arise wherever a new power
of a occurs. For a = 3 this next happens at level 11 from the equality of $a^2$ at level
4 with $a + a + a$ at level 7 (Figure 7). Grafting these in either order onto the vertices
of the 1-tree gives 2 non-isomorphic trees, each with 11 vertices and value $3^9 + 9$.
More generally, this first occurs at level $2a + 5$.


FIG. 4    FIG. 5    FIG. 6    FIG. 7 (tree diagrams not reproduced)




For larger values of a, these events occur at correspondingly higher levels, so we
are able to list the number of distinct values for k = 1(1)11 and $a \ge 3$.


k        1  2  3  4  5   6   7    8    9   10    11
a = 3    1  1  2  4  9  20  47  111  270  664  1659
a = 4    1  1  2  4  9  20  48  114  282  703  1787
a = 5    1  1  2  4  9  20  48  115  285  715  1826
a = 6    1  1  2  4  9  20  48  115  286  718  1838
a = 7    1  1  2  4  9  20  48  115  286  719  1841

r_k      1  1  2  4  9  20  48  115  286  719  1842


For $k \le a + 3$, this number is the same as $r_k$. For $k = a + 4$, $a + 5$, $a + 6$, it is
$r_k - 1$, $r_k - 4$ and $r_k - 16$. Thereafter the extra complications have to be taken into
account. A more powerful enumeration could be made by an application of the
Redfield-Pólya theorem, but technical difficulties will still arise.

We can answer the converse question: at what levels and with what frequencies
does a particular value occur? Partition the value into parts which are powers of a;
similarly partition all exponents. Do this in every possible way. For example, if
a = 3 then 28 can be expressed in 24 ways as

$$3^3 + 1 = 3^{1+1+1} + 1 = 3^{1+1} + 3^{1+1} + 3^{1+1} + 1$$
$$= 3^{1+1} + 3^{1+1} + 3 + 3 + 3 + 1$$
$$= 3^{1+1} + 3^{1+1} + 3 + 3 + 1 + 1 + 1 + 1 = \cdots$$

so that 28 occurs (as a second order exponent) just at levels 5, 6, 11, 14-17, 17-23
and 20-29, i.e., it is duplicated at levels 17 and 20 through 23.
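
The count of 24 and the list of levels can be reproduced by a short dynamic program. This is a sketch under the following assumptions: a is an integer at least 2, a tree is identified with a multiset of subtrees, a subtree whose own second order exponent is e contributes $a^e$ to the value, and the level of a tree is its number of vertices. The function name is hypothetical.

```python
from collections import Counter
from functools import lru_cache
from math import comb


def level_counts(n, a):
    """Counter {level: number of trees} whose second order exponent equals n."""

    @lru_cache(maxsize=None)
    def ways(value):
        # dp[(v, s)] = number of multisets of subtrees with total value v and s vertices
        dp = Counter({(0, 0): 1})
        e = 0
        while a ** e <= value:
            power = a ** e
            # trees of value e, grouped by size; `types` trees share each size
            for size, types in sorted(ways(e).items()):
                new_dp = Counter()
                for (v, s), count in dp.items():
                    m = 0
                    while v + m * power <= value:
                        # choose m subtrees, with repetition, from the `types` available
                        new_dp[(v + m * power, s + m * size)] += count * comb(types + m - 1, m)
                        m += 1
                dp = new_dp
            e += 1
        # add one vertex for the root
        return {s + 1: count for (v, s), count in dp.items() if v == value}

    return Counter(ways(n))


counts = level_counts(28, 3)
print(sum(counts.values()))
print(sorted(level for level, c in counts.items() if c > 1))
```

For n = 28 and a = 3 this gives 24 trees in all, with repeated levels 17 and 20 through 23, in agreement with the statement above.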

Finally we consider a = 2. Here there is an immediate coincidence at level 3,
as we noted at the outset. In Figure 1, the a-tree and the 2-tree have the same value.
So we eliminate the former, and ‘prune’ all rooted trees, in the sense that wherever
the a-tree appears, we replace it by the 2-tree. Such trees were called ‘trimmed’ by
Göbel and Nederpelt [3]. As they pointed out, pruned trees can be enumerated by
the same recurrence as for $r_k$, except that as we have replaced all a-trees by (1 + 1)-trees,
we have no contribution to any partition which contains a part of size 2.
The corresponding numbers, $s_k$, of pruned trees with k vertices, are:


k      1   2   3  4  5  6   7   8   9   10   11   12    13
s_k    1  (1)  1  2  4  8  17  36  79  175  395  899  2074


The parentheses mean that $s_2$ should be taken as zero in applying the recurrence
relation.
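
The modified recurrence is easy to program. In the sketch below, which is one possible reading of the rule just described, every partition of $k - 1$ containing a part of size 2 is skipped, which has the same effect as taking $s_2 = 0$ inside the recurrence; the helper names are hypothetical.

```python
from math import comb


def partitions(n, max_part=None):
    """Yield the partitions of n as dicts {part size: multiplicity}."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for part in range(min(n, max_part), 0, -1):
        for mult in range(1, n // part + 1):
            for rest in partitions(n - part * mult, part - 1):
                yield {part: mult, **rest}


def pruned_trees(n_max):
    """Return [s_1, ..., s_{n_max}], the numbers of pruned rooted trees."""
    s = {}
    for k in range(1, n_max + 1):
        total = 0
        for partition in partitions(k - 1):
            if 2 in partition:
                continue                    # no contribution from parts of size 2
            ways = 1
            for size, mult in partition.items():
                ways *= comb(s[size] + mult - 1, mult)
            total += ways
        s[k] = total
    return [s[k] for k in range(1, n_max + 1)]


print(pruned_trees(13))
```

The value printed for k = 2 is the parenthesized 1 of the table; it is never used in later steps because every partition containing a part of size 2 is skipped.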


For a = 2, the first few values of the second order exponents are:


k | $\{e_k\}$
1 | 0
2 | 1
3 | 2
4 | 4; 3
5 | 16, 8; 5; 4
6 | 65536, 256, 32, 16; 17, 9; 6; 5
7 | $2^{65536}$, $2^{256}$, $2^{32}$, 65536, 131072, 512, 64, 32; 65537, 257, 33, 17; 18, 10; 8; 7; 6


The first coincidence is 4, at levels 4 and 5, so the first coincidence at the same
level (above level 3) is $2^4 + 4 = 20$ at level 9 (see Figure 8). Complications of the
first kind occur first at level 4 + 4 + 5 = 13, and of the second kind at level 5 + 7 = 12
from $a^3 = 2a^2$ (Figure 9). Note that in using Figure 4 to count duplicates at level 11
we ignore the 6th and 15th trees, since even after grafting they would contain an
a-tree. But at this level there are two duplicates of the second kind, since (see Figure 10),


a2 
q?@+l — 294” and a? = 2a”. 
FIG. 8    FIG. 9    FIG. 10 (tree diagrams not reproduced)


This gives the following numbers of distinct values of k-level expressions with a = 2.


k        1  2  3  4  5  6   7   8   9   10   11
a = 2    1  1  1  2  4  8  17  36  78  171  379


There seems to be no simple characterization of what we might call exponential
numbers, which lead to coincidences of value of k-level expressions. The coincidence
may be between different levels in the first instance, but this will induce coincidences
at the same level for all sufficiently large k, and the number of distinct values will be
less than $r_k$ for such k. The exponential numbers include all algebraic numbers, but
do not form a field.


We list the numbers of distinct values of k-level expressions for the algebraic
numbers $\frac{1}{2}(1 + \sqrt{5})$ and $\sqrt{2}$ and for the transcendental positive root of $a^a = 2$.


k                1  2  3  4  5   6   7   8    9
a^2 = a + 1      1  1  2  3  7  15  35  81  195
a^2 = 2          1  1  2  4  8  17  38  89  208
a^a = 2          1  1  2  4  8  17  39  90  213


We wish to thank John Riordan and the referee for suggestions. 


R.K. Guy’s research supported by dwindling grant A-4011 of the National Research Council of 
Canada. 


References 


1. E. Catalan, Note sur une équation aux différences finies, J. Math. Pures Appl., (1) 3 (1838) 


508-516. 

2. G. Eldredge, Nesting habits of the laddered parenthesis, Problem E1903, this MONTHLY, 73, 
(1966) 666; M. Goldberg, incomplete solution, ibid., 77 (1970) 525-526; E. F. Schmeichel, comment 
ibid. 78 (1971) 298; completion of solution, ibid. 79 (1972) 395-396. 

3. F. Gébel and P. R. Nederpelt, The number of numerical outcomes of iterated powers, this 
MOonrtTHLY, 78 (1971) 1097-1103. 

4. F. Harary, Graph Theory, Addison-Wesley, Reading, Mass., 1969, 187-190. 

5. J. Riordan, An introduction to combinatorial analysis, Wiley, New York, 1958, 125-139. 

6. ———, A note on Catalan parentheses, this MONTHLY, 80 (1973) 904-906.

7. N. J. A. Sloane, A Handbook of Integer Sequences, Academic Press, New York, 1973. 


CORRECTION TO “THE MATHEMATICAL SOCIETIES AND ASSOCIATIONS 
IN THE UNITED KINGDOM” 


THOMAS WILLMORE, University of Durham, England 


In this MONTHLY 79 (1972) 985-989, I stated that reviews of new mathematical 
books appear in the Journal of the London Mathematical Society. This used to be 
the case, but the London Mathematical Society now produces a very good journal, 
the Bulletin, which contains interesting information, lengthy expository articles and 
also the book reviews which previously would have appeared in the Journal. 

I omitted all reference to the Edinburgh Mathematical Society, a Mathematical 
Society of long standing, which, although primarily concerned with mathematical 
research, has also had considerable influence on mathematics teaching. This justly 
provoked criticism from its President, Professor W. D. Collins, who incidentally 
extends a warm invitation to all members of the Mathematical Association of America 
to attend meetings of the Edinburgh Mathematical Society if they are able to do so. 
At least one Englishman will no longer identify “England” and “United Kingdom”
in the future!


SQUARING RECTANGLES AND SQUARES 


N. D. KAZARINOFF, State University of New York at Buffalo, and 
ROGER WEITZENKAMP, The University of Michigan 


1. Introduction. A squared rectangle is a closed rectangular region subdivided 
into a finite number of square regions that intersect only at their boundaries. The 
order of a squared rectangle is the number of its component squares. A squaring (of 
a rectangle) is perfect if no two component squares are congruent; otherwise it is 
imperfect. A simple squared rectangle properly contains no squared rectangle of 
order more than one. All other squared rectangles are compound. 

Perfect squared rectangles of low order are easy to find, once one knows how to 
generate them. Perfect squared squares of low order are exceedingly rare at best. 
The perfect squared square of least order known has order 24 and is compound. It was 
found by T. H. Willcocks in 1948 [23,24]. In 1965, W. T. Tutte [21] reported in this
MONTHLY that A. J. W. Duijvestijn [9] had shown no perfect squared squares of order
less than 20 exist. But, in fact, Duijvestijn only resolved the problem of determining 
all simple squared rectangles of order less than 20. We have recently [12] proved 
that there does not exist a compound perfect squared square of order less than 22. 
Thus there exists no perfect squared square of order less than 20, and Tutte’s 
generous restatement of Duijvestijn’s result is true. 

Study of squarings of rectangles involves some graph theory, topology, combina- 
torics, number theory, and computer programming, which makes it an attractive 
subject. In this article we introduce the reader to the theory of squared rectangles, 
and we give an account of both recent and past results. The prerequisites we require 
are an elementary knowledge of topology, of how to solve a system of simultaneous 
linear equations, and of Kirchhoff’s Laws. For an exposition less technical than 
ours, we refer the reader to an article by Tutte [19]. 

Although the study of squaring rectangles is old (see Section 6 for an historical 
account) a mathematical theory for squared rectangles is much younger. In 1940 


N. D. Kazarinoff did his University of Wisconsin Ph. D. under R. E. Langer. He held positions 
at Purdue University and the University of Michigan before joining SUNY at Buffalo as Chairman 
of the Mathematics Department, and now also Martin Professor of Mathematics. He spent a year 
leave at the University of Wisconsin, was an exchange professor at the Steklov Institute of Mathe- 
matics, Moscow in 1960-61, and again in the spring semester of 1965. 

He served as managing editor of the Michigan Mathematical Journal, as the consulting editor of 
Mathematical Reviews, as Chairman of the MAA Putnam Examination Committee, and on numer- 
ous MAA and AMS Committees. He is an elected member of CBMS, and in 1968 he received an 
award for Distinguished Undergraduate Teaching from the University of Michigan. His main research 
is in differential equations, and he is the author of Geometric Inequalities (1961), Analytic Inequalities 
(1961), and Ruler and the Round (1970). 


Roger Weitzenkamp did his undergraduate and master’s degrees at the University of Nebraska. 
He is a graduate student at the University of Michigan, and his main interest is combinatorics. 
Editor. 




Brooks, Smith, Stone, and Tutte [6] constructed an elegant, and indeed, definitive 
theory of squared rectangles. They related squaring rectangles to determining current 
distributions in certain electrical networks (planar graphs). These networks are 
composed of wires of one ohm resistance, except for one wire that contains a battery 
whose potential produces the current distribution. The central results of Brooks, Smith, 
Stone, and Tutte are: (1) there exists a one-to-one correspondence between squared 
rectangles and certain equivalence classes of planar graphs, and (2) each simple 
perfect squared rectangle corresponds to an electrical network of unit resistances 
and battery that is equivalent to a 3-connected planar graph. (See Section 3 for 
definitions of the terms 3-connected and planar graph.) Our article is based on the 
theory of Brooks et al. 

Somewhat after 1940, Tutte found, and later published [20], an algorithm for 
creating a complete list of 3-connected, finite, planar graphs. Duijvestijn [9] used 
Tutte’s algorithm to search with an electronic computer for simple perfect squared 
squares by inductively creating a first portion of the list of all 3-connected planar 
graphs, an induction beginning with the 3-connected planar graphs of six and eight 
edges (Fig. 1). 


Fig. 1.


The theory of Brooks et al, Tutte’s theorem [20], and our extension of it (Theorems 
2 and 4 below) provide an algorithm for generating all perfect squared rectangles — 
simple and compound. But almost nothing is known of the obvious problem: given 
a closed rectangular region how can it be subdivided to yield a perfect squared 
rectangle? Max Dehn [8] proved that a rectangle can be squared if and only if its
sides are commensurable, and first R. Sprague [16] and, independently, Brooks
et al. proved that each rectangle with commensurable sides can be squared perfectly
in infinitely many totally distinct ways. But no one has found any algorithm for 
determining the perfect squaring of least order or even a reasonable estimate of that 
minimal order. For example, I. M. Yaglom [25] has shown that an a by b rectangle 
(a/b rational) always can be subdivided to yield a perfect squared rectangle of 
order at most $13a^2b^2 - 11ab - 1$, which for a 32 by 33 rectangle yields the upper bound




14,485,151. But in actuality the perfect squaring of a 32 by 33 rectangle of minimal 
order has order 9. 
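
As a quick arithmetic check of how loose this bound is, the following sketch evaluates Yaglom's expression for the 32 by 33 rectangle. The function name `yaglom_bound` is ours, introduced only for illustration; the formula and the minimal order 9 are taken from the text.

```python
def yaglom_bound(a: int, b: int) -> int:
    """Upper bound 13*a^2*b^2 - 11*a*b - 1 quoted in the text (hypothetical helper name)."""
    return 13 * a**2 * b**2 - 11 * a * b - 1


bound = yaglom_bound(32, 33)
minimal_order = 9  # order of the minimal perfect squaring, as stated in the text

print(bound)                   # 14485151
print(bound // minimal_order)  # roughly how many times larger the bound is
```

The bound exceeds the true minimal order by more than six orders of magnitude, which is the point of the example.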


2. Generation of squared rectangles using Kirchhoff’s Laws. The following example 
illustrates the method of Brooks et al for constructing squared rectangles from 
planar graphs. Tutte [19] has also written an elementary account of the relationship 
between planar graphs and squared rectangles. We consider the electrical network
$S^+$ illustrated in Fig. 2.


Fig. 2.


Suppose each resistance $R_j$ is 1 ohm and that the current $i_j$ in the resistance $R_j$ is
positive if it flows in the direction indicated and negative if it does not. Kirchhoff’s
Laws applied to a network are expressed as mesh equations (the change in potential
around any closed path in the network is zero) and vertex equations (the flow of
current into a vertex equals the flow out). For the illustrated network the vertex
equations are:


$i_1 + i_3 = i_5 + i_7$
$i_2 = i_3 + i_4$
$i_4 + i_5 = i_6.$


The mesh equations are: 




These six equations are solvable for the seven currents $(i_1, \ldots, i_7)$ up to a constant
factor of the unknowns, which we may choose so as to obtain a least solution in
integers. This solution is (4,3,1,2,1,3,4). We now imagine that in the network $S^+$ each
wire containing a resistance corresponds to a rectangle of width equal to the absolute
value of current in the wire and height equal to the absolute value of the drop in
potential over the wire. Since each $R_j$ equals 1 ohm, the drop in potential numerically
equals the current; and the associated rectangle is a square. The imperfect squared
rectangle that is thus obtained from the network $S^+$ of Fig. 2 is illustrated in Fig. 3.
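
The computation just described can be sketched as follows. The three vertex equations are the ones stated above, as we read them. The three mesh equations did not survive in this copy, so the rows of `MESH` are an assumption on our part, chosen to agree with the first mesh equation quoted later in this section and with the stated solution. The helper `least_integer_solution` is a hypothetical name for a plain Gauss-Jordan elimination over the rationals.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Each row holds the coefficients of (i1, ..., i7) in an equation "row . i = 0".
VERTEX = [
    [1, 0, 1, 0, -1, 0, -1],   # i1 + i3 = i5 + i7
    [0, 1, -1, -1, 0, 0, 0],   # i2 = i3 + i4
    [0, 0, 0, 1, 1, -1, 0],    # i4 + i5 = i6
]
# ASSUMPTION: mesh equations reconstructed by us, not copied from the article.
MESH = [
    [1, -1, -1, 0, 0, 0, 0],   # i1 - i2 - i3 = 0
    [0, 0, 1, -1, 1, 0, 0],    # i3 - i4 + i5 = 0
    [0, 0, 0, 0, 1, 1, -1],    # i5 + i6 - i7 = 0
]


def least_integer_solution(rows, n):
    """Least integer vector spanning a one-dimensional null space of `rows`."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(n):  # Gauss-Jordan elimination, column by column
        p = next((k for k in range(r, len(m)) if m[k][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for k in range(len(m)):
            if k != r and m[k][c] != 0:
                f = m[k][c]
                m[k] = [a - f * b for a, b in zip(m[k], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = [c for c in range(n) if c not in pivots]
    if len(free) != 1:
        raise ValueError("expected exactly one free unknown")
    v = [Fraction(0)] * n
    v[free[0]] = Fraction(1)
    for k, c in enumerate(pivots):
        v[c] = -m[k][free[0]]
    # Clear denominators, then divide out any common factor.
    scale = reduce(lambda a, b: a * b // gcd(a, b), (x.denominator for x in v), 1)
    ints = [int(x * scale) for x in v]
    g = reduce(gcd, ints)
    sign = -1 if next(x for x in ints if x != 0) < 0 else 1
    return [sign * x // g for x in ints]


currents = least_integer_solution(VERTEX + MESH, 7)
print(currents)                      # [4, 3, 1, 2, 1, 3, 4]
print(sum(i * i for i in currents))  # 56, the area of a 7 by 8 rectangle
```

Exact rational arithmetic is used so that the least integer solution can be read off without rounding.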


Fig. 3


Note that each horizontal line segment in this figure corresponds to a vertex of
$S^+$ and that each square corresponds to an edge of $S^+$. Given any planar electrical
network of 1 ohm resistances and a battery, a squared rectangle can be derived from
it in this way.

Let us vary this example. Suppose now $R_1 = b$ and $R_2 = R_3 = \cdots = R_7 = 1$,
that is, suppose that not all resistances in $S^+$ are 1 ohm. This adds one unknown,
namely b, to the system of equations we obtain from $S^+$ and changes the first mesh
equation to

$b \cdot i_1 - 1 \cdot i_2 - 1 \cdot i_3 = 0.$

To solve for $(i_1, \ldots, i_7, b)$ we need another equation. To provide it we add a “fixed
ratio” condition, one that introduces the width-to-length ratio c of the “rectangled”
rectangle to be derived from $S^+$. This condition is:

$c(b \cdot i_1 + 1 \cdot i_7) = i_1 + i_2.$

For each c, the seven equations in the eight unknowns are again solvable up to a
constant factor of the currents. For c = 1, a solution is (8, 4, 1, 3, 2, 5, 7, 5/8). Again
we imagine that in $S^+$ each wire containing a resistance corresponds to a rectangle.
Only this time one rectangle, the one corresponding to $R_1$, is not a square. Because
of the choice c = 1, the “rectangled” rectangle corresponding to $S^+$ is a square;
see Fig. 4.
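
The quoted solution can be checked directly. In the sketch below the vertex equations, the modified first mesh equation, and the fixed ratio condition are written as we read them from the text; the subscripts in the ratio condition are our reading of a damaged line and should be treated as an assumption. The check also confirms that the pieces tile a 12 by 12 square, with one 8 by 5 rectangle in place of a square.

```python
from fractions import Fraction

# Solution quoted in the text for c = 1: (i1, ..., i7, b).
i1, i2, i3, i4, i5, i6, i7 = 8, 4, 1, 3, 2, 5, 7
b = Fraction(5, 8)
c = 1

checks = {
    "vertex 1": i1 + i3 == i5 + i7,
    "vertex 2": i2 == i3 + i4,
    "vertex 3": i4 + i5 == i6,
    "modified first mesh": b * i1 - i2 - i3 == 0,
    # ASSUMPTION: indices in the ratio condition are our reading of the text.
    "fixed ratio": c * (b * i1 + i7) == i1 + i2,
}
for name, ok in checks.items():
    print(name, ok)

width = i1 + i2          # total current leaving the upper pole
height = b * i1 + i7     # total potential drop between the poles
# Wire 1 gives an i1 by (b * i1) rectangle; every other wire gives a square.
area = i1 * (b * i1) + sum(k * k for k in (i2, i3, i4, i5, i6, i7))

print(width, height, area)  # 12 12 144
```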




Fig. 4


The point of this variation of the example is that if we knew a squaring of a 5 by 8 
rectangle, then it would yield a compound squaring of a square. We shall seek to 
make clear a process of exhaustively producing such compound squarings of squares. 

There is, however, a difficulty arising from the first example. The resulting 
squared rectangle is compound, but the graph $S^+$ is 3-connected. We should like
simple squared rectangles to correspond to 3-connected planar graphs, and compound 
squared rectangles to correspond to 2-connected planar graphs. A more careful 
scrutiny in the next section will allow us to make such a classification. 


3. Graphs corresponding to perfect squared rectangles. A finite planar graph is a 
finite, planar collection of points called vertices and closed connected arcs called edges, 
together with a correspondence associating edges with vertices, namely, the vertices 
are the endpoints of the appropriate edges. A loop is an edge whose endpoints 
coincide. The order of a graph is the number of its edges. Throughout this article 
we deal only with connected, finite, planar graphs of positive order which we shall 
simply call nets, and we consider them as point sets. 

A simple closed curve contained in a net that either contains no edges in its
interior or all edges in the closure of its interior is a mesh of that net. If two vertices
on the same mesh of a net are designated as poles, the net is a polar net. If A and B
are polar nets, $A \subset B$, and A meets $B - A$ only at the poles of A, then A is a polar
subnet of B. If S is a polar net with poles v and w, the completion $S^+$ of S is the
net formed by adding to S one edge joining v and w.

Let S be a net. If there exists a vertex v of S such that $S - v$ is not connected, then
S is 1-connected. If S is not 1-connected, H and K are subsets of S, each containing
at least two edges, and v and w are vertices of S, such that $S = H \cup K$ and
$H \cap K = v \cup w$, then S is 2-connected. If S is neither 1-connected nor 2-connected, then S
is 3-connected.

Our first theorem provides the means for defining a polar net corresponding to a 
squared rectangle. This theorem is a rewording of a theorem formulated by Tutte 
[17, §2.2] for triangles.




THEOREM 1. For each squaring of a rectangle R with component squares $S_j$,
there exists an orientation of the rectangle and a set of closed line segments
$p_i^\sigma$ ($\sigma = h, v$; $i = 1, 2, \ldots, m_\sigma$), where $m_h$ and $m_v$ are positive integers, such that:

(a) The union of the $p_i^\sigma$ is the union of the sides of the component squares $S_j$,
each side of each $S_j$ being contained in some $p_i^\sigma$.

(b) Each $p_i^\sigma$ is horizontal or vertical as $\sigma = h$ or $\sigma = v$.

(c) Two distinct segments have at most one point in common.

(d) If w is a vertex of a square $S_j$ but not a vertex of R, then w is an interior
point of just one of the segments $p_i^\sigma$. If such a vertex w is common to four of the
squares $S_j$, then w is an interior point of some $p_i^h$.


Given a squared rectangle R, let P = P(R) denote that polar net whose vertices
correspond to the segments $p_i^h$ and whose edges correspond to the squares $S_j$. If we
consider P as an electrical network with unit resistance in each edge, a voltage
applied to the poles of P induces currents in the edges which are proportional to the
sides of the squares they represent. Brooks et al. [6, p. 324] show that if R is simple,
then $P^+$ is 3-connected, while Theorem 3 (below) shows that if R is compound, then
$P^+$ is 2-connected. Since R can also be determined from $P^+$, we have the desired
correspondence between nets and rectangles which was mentioned at the end of the
last section. Part (d) of the conclusion of Theorem 1 plays a key role in this
correspondence.


DEFINITION. For $n \geq 5$, let $\mathcal{S}_n$ denote the set of all finite planar graphs S such
that:

(a) S is a polar net of order n.

(b) $S^+$ is 2-connected or 3-connected.

(c) No two edges of S have the same pair of endpoints.

(d) Each vertex of S that is not a pole is an endpoint of at least 3 edges.

It is easy to construct a family of nets which shows that $\mathcal{S}_n$ is nonempty for each
$n \geq 5$.


LEMMA. Let R be a perfect squared rectangle of order n. Then $P = P(R) \in \mathcal{S}_n$.
This lemma shows that the class $\mathcal{S}_n$ was well chosen.


Proof of the Lemma. The net P is polar by construction, its poles corresponding
to the $p_i^h$ at the top and bottom of R. It has more than 5 edges because a perfect
squared rectangle must contain at least nine component squares [6, p. 324]. To
establish the second property of membership in $\mathcal{S}_n$ it is sufficient to show that for
any vertex v of $P^+$ there exists a circuit containing v and the poles of P. Such a
circuit may be found by tracing a path from v to each pole via the corresponding
squares in R, and including the edge $P^+ - P$. Finally, if either of the last two properties
of membership in $\mathcal{S}_n$ were violated by P, then two edges of P would carry the same
(nonzero) current in the electrical model of P. This is not possible because R is perfect.

The set $\mathcal{S}_n$ can be broken conveniently into four parts by the following theorem.




THEOREM 2. Each element S of $\mathcal{S}_n$ satisfies exactly one of the following:

(1) $S^+$ is 3-connected.

(2) $S = X \cup x$, where $X \in \mathcal{S}_{n-1}$ and x is an edge added to a pole p of X in such
a way that x connects p to one pole of S and $X \cap x = p$. The second pole of X is
the second pole of S.

(3) $S = Y \cup y$, where $Y \in \mathcal{S}_{n-1}$, the poles of Y are the poles of S, and y is an
edge joining the poles of Y.

(4) There exist integers m and k with $m, k \geq 5$ and polar nets A in $\mathcal{S}_m$ and B
in $\mathcal{S}_k$ such that $A^+$ is 3-connected, and $S^+$ is formed by joining A and B at
their poles.


We omit the proof of this theorem. It is long and somewhat tedious, involving 
counting and connectivity arguments. 
The following theorem is implicit in [6, p. 323]. 


THEOREM 3. Let R be a compound perfect squared rectangle, and let P = P(R).
Then $P^+$ is 2-connected.


Proof. By the lemma, $P \in \mathcal{S}_n$ for some n. Therefore $P^+$ is 2-connected or 3-connected.
Since R is compound, it properly contains a perfect squared subrectangle
$R_1$. Let $P_1 = P(R_1)$. From Theorem 1(d), we conclude that a vertex of $P_1$ that is not
a pole of $P_1$ is incident only with edges corresponding to squares of $R_1$. Thus $P_1$ is a
polar subnet of P, and $P^+$ is 2-connected.

We shall call a net S in $\mathcal{S}_n$ a $T_i$ net (i = 1,2,3,4) if S satisfies the ith conclusion of
Theorem 2. To discover all compound perfect squarings of rectangles one need not
consider $T_1$ nets because of the above theorem. Also, perfect rectangles corresponding
to $T_2$ and $T_3$ nets consist of a square adjoined to one side of a smaller perfect squared
rectangle, so that they are easy to find inductively. We are left with $T_4$ nets.


THEOREM 4. If Q is a compound perfect squared square of order n, then P = P(Q)
is a $T_4$ net of order n.


4. Gnomons. Defining a gnomon as the completion of a $T_4$ net, we know that to
determine all compound perfect squared rectangles it is sufficient to create a hierarchical
list of gnomons. Theorem 2 provides the means for doing this. Conclusion (4) of
Theorem 2 describes the compound structure of gnomons, and all the conclusions
describe the basic parts of gnomons. In this section we present an algorithm for
creating a complete, hierarchical list of gnomons.

It is possible with forethought to eliminate certain portions of this list from
consideration. For example if C is a polar net that corresponds to an imperfect
squared rectangle, no gnomon G containing C can yield a perfect squared rectangle
so long as the “battery” edge is an edge of $G - C$. After eliminating as many
gnomons from the list as we easily could through mathematical analysis, we generated 
the remainder of the gnomons in the list having 22 or fewer edges by IBM-360 
computer and dissected them one by one, also by computer. Over 17,000 gnomons 




were dissected by the computer. The program we used to perform the dissections is a 
modification of Duijvestijn’s program [9] that was written by James Reeds III. 

We emphasize that when we dissected a gnomon of order n, we did so in all pos- 
sible ways; that is, we solved the electrical networks determined by placing a battery 
in turn in each edge of the gnomon and unit resistances in each of the other n — 1 
edges. The n squared rectangles resulting from these n dissections may be all different 
or there may be several alike. Each may be perfect or imperfect. The final result of 
this analysis of gnomons was the following theorem. 


THEOREM 5. There exists no compound perfect squared square of order 21 or less. 


We describe one method of constructing gnomons. Consider a $T_4$ net $S = S_0$ in
$\mathcal{S}_n$, and let A, B, m, and k be as in conclusion (4) of Theorem 2. Since B belongs to
$\mathcal{S}_k$, it is a $T_i$ net for some i. If B is a $T_2$ or a $T_3$ net, remove the edge that corresponds
to the edge x or y of Theorem 2 to obtain a net $B_1$ in $\mathcal{S}_{k-1}$. Repeat the procedure, if
possible, to obtain $B_0 = B \in \mathcal{S}_k$, $B_1 \in \mathcal{S}_{k-1}$, $\ldots$, $B_j \in \mathcal{S}_{k-j}$, $\ldots$. The procedure
terminates at some $S_1 = B_{i_1}$, where $S_1$ is either a $T_1$ or a $T_4$ net. Then the gnomon $S^+$
can be realized as the union of $S_1$, a $T_1$ net $A_0 = A$, and $i_1 + 1$ edges of the types of x
and y in Theorem 2. If $S_1$ is a $T_1$ net, no further decompositions are needed. Otherwise,
$S_1$ is a $T_4$ net, and $S_1$ is decomposed in the way S was decomposed. Repeat the
procedure for each $S_j$ ($j = 1, 2, \ldots$) until a $T_1$ net, say $S_r$, is reached. At this final
stage the original $T_4$ net S is described by:

(1) a sequence of $T_1$ nets $A_0, \ldots, A_{r-1}, A_r = S_r$,

(2) a sequence of $T_4$ nets $S_0, \ldots, S_{r-1}$, and

(3) $g = i_1 + \cdots + i_r$ edges of the types of x and y in Theorem 2,
in such a way that for each $j < r$, the gnomon $S_j^+$ is the union of $A_j$, $S_{j+1}$, and $i_{j+1}$
edges of the types of x and y; see Fig. 5 for an example.


Fig. 5


By reversing the steps in the above procedure we reconstruct $S^+$ from the $A_j$’s
and the extra edges. Indeed, beginning with arbitrary $T_1$ nets $\{A_j\}$ and as many
extra edges as needed, we (theoretically) can construct all gnomons of a given order
m.




Another procedure for searching for compound squared squares is illustrated by 
the second example in Section 2. One can substitute an unknown resistance in one 
or more wires in the electrical analogue of a net and solve that network by Kirchhoff’s 
laws subject to the constraint that the resultant squared rectangle be a square. 
This method depends upon a knowledge of squared rectangles and, ultimately, simple 
perfect squared rectangles. In the example of Section 2 knowledge of a 5 by 8 rectangle 
is required. If there were one of order 17 or less, then it would yield a compound 
squared square of order less than 24, which could perhaps be perfect. (No such 5 by 8 
rectangle does exist.) This method of search has been used by P. J. Federico [10]. 


5. 3-connected nets and simple squared rectangles. The fundamental data for 
generating compound squarings of rectangles are the 3-connected nets whose dis- 
sections yield simple squared rectangles. The basic theorem for generating 3-con- 
nected nets is Tutte’s [20]. 


THEOREM 6. (Tutte). Let G be a 3-connected planar graph with no loops, at 
least 4 vertices and such that no two edges have the same end points. Suppose further 
that G is not a wheel. Then either G or its dual graph can be derived from a simple 
3-connected planar graph H by adjoining a new edge e to H whose ends are vertices 
of the same mesh of H and are not joined by an edge of H. 


Duijvestijn used this theorem to write a computer program. Using the computer,
he generated the 3-connected nets of orders less than 21 and showed there exists
no simple perfect squared square of order less than 20.

We repeated some of his work. By computer we found and printed all dissections 
of rectangles corresponding to 3-connected nets with 17 or fewer edges. We also 
generated all such nets of orders 18 and 19, dissected them, and printed perfect 
dissections yielding ratios p/q with p+ q < 300. (Duijvestijn [9] counts eight 3- 
connected nets of 12 edges. We found nine. All other counts agree.) We present 
some statistics arising from these data in Table I.

In Table II we give the Bouwkamp codes of the simple perfect squared rectangles
of orders 16, 17 and 18 having sides with reduced ratios p/q such that p + q < 30.
The Bouwkamp code of a 32 by 33 simple perfect squared rectangle of order 9, the 
least order possible, is: (18,15) (7,8) (14,4) (10,1) (9). The edge lengths of squares 
whose upper edges are segments of the same horizontal dissector are grouped in 
parentheses; the groups are listed in order of decreasing levels of the horizontal 
dissectors. These levels correspond to the levels of the potential in the corresponding 
electrical network. 
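
The reading rule for a Bouwkamp code can be sketched as a small decoder. The placement rule used here (each group starts at the leftmost position of the highest open level) is our interpretation of the description above, and `decode_bouwkamp` is a hypothetical helper, not a routine from the programs mentioned in the text. Levels are measured downward from the top edge of the rectangle.

```python
def decode_bouwkamp(code):
    """Place the squares of a Bouwkamp code; return (width, height, squares).

    Each square is reported as (x, y, side), with (x, y) its upper-left corner
    and y measured downward from the top edge of the rectangle.
    """
    width = sum(code[0])      # the first group spans the whole top edge
    depth = [0] * width       # depth[x]: how far column x is already filled
    squares = []
    for group in code:
        level = min(depth)            # highest horizontal dissector still open
        x = depth.index(level)        # leftmost point on that dissector
        for side in group:
            span = depth[x:x + side]
            if len(span) < side or any(d != level for d in span):
                raise ValueError(f"square of side {side} does not fit at x={x}")
            depth[x:x + side] = [level + side] * side
            squares.append((x, level, side))
            x += side
    if len(set(depth)) != 1:
        raise ValueError("code does not fill a rectangle")
    return width, depth[0], squares


# Code of the order-9 rectangle quoted in the text.
order9 = [[18, 15], [7, 8], [14, 4], [10, 1], [9]]
width, height, squares = decode_bouwkamp(order9)
sides = [s for _, _, s in squares]

print(width, height)                  # 33 32
print(len(set(sides)) == len(sides))  # True: no two squares are congruent
```

Because the decoder raises an error whenever a square does not sit on a flat stretch of the current level, it also serves as a consistency check on a transcribed code.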

In all perfect squared rectangles of small order (9 or 10 or 11) the largest subsquare
appears in a corner. This phenomenon tends to persist, although occasionally the
largest square appears at the “middle” of one of the sides. An example of a simple
perfect squared rectangle of order 22 in which the largest subsquare appears in the
“center” of the dissection is: (419,366,174,156,255) (18,138) (192) (39,216) (177)
(53,505) (472) (393) (7,386) (133,379) (359,113) (246). This is a 1370 by 1250 rectangle.




TABLE I

Percentage of nets yielding at least one squared rectangle with sides having
reduced ratio p/q such that:

No. of   Total no. of 3-   p+q<300     p+q<30      p+q<30        p or q<40   p/q=1         p/q<1/2
edges    connected nets    (perfect)   (perfect)   (imperfect)   (perfect)   (imperfect)   (perfect)

10       2                 50          0           100           50          0             0
11       2                 100         0           100           0           0             0
12       9                 22.2        0           66.7          0           11.1          0
13       11                54.5        0           63.6          9.1         9.1           0
14       37                32.4        0           48.6          0           10.8          2.7
15       79                22.8        0           35.4          0           3.8           3.8
16       249               14.1        0           20.5          1.2         4.0           6.8
17       671               14.0        .15         18.2          2.4         1.8           6.9
18       2182              11.5        .23         —             —           2.1           —
19       6692              8.7         .09         —             —           [illegible]   —
20       12,123*           5.5*        .1*         —             —           .8*           —
21       5,998*            3.0*        .1*         —             —           .5*           —
22       4,949*            2.2*        .04*        —             —           .3*           —

* only the number of nets sampled and percentages thereof


TABLE II

Order   p/q     Bouwkamp code

16      14/15   (87, 95) (39, 48) (40, 55) (27, 12) (3, 60, 25) (15) (10, 45) (42) (35)
16      11/18   (70, 73) (67, 3) (76) (39, 9, 7, 12) (2, 5) (11) (8, 85) (19) (58)
17      13/14   (51, 30, 88) (13, 17) (8, 5) (1, 16) (6) (56, 3) (9) (25) (19, 94) (75)
17      5/7     (17, 19, 27, 21) (15, 2) (13, 8) (5, 16) (1, 4) (33, 3) (7) (28) (23)
17      3/5     (40, 29) (13, 16) (36, 4) (2, 8, 3) (6) (19) (14) (12, 21) (39, 9) (30)
17      14/15   (145, 93) (45, 48) (42, 3) (23, 28) (110, 35) (18, 5) (33) (75, 2) (20) (53)
17      11/15   (67, 52, 46) (6, 19, 21) (28, 30) (17, 2) (54, 13) (23) (41) (39, 8) (31)
18      9/10    (163, 98) (44, 54) (21, 23) (13, 41) (127, 57) (36) (8, 33) (19, 25) (70, 6) (64)
18      13/14   (123, 150) (91, 32) (6, 144) (23, 8, 1) (7) (15) (38) (80, 11) (9, 29) (20) (49)
18      14/15   (95, 61, 30, 54) (17, 13) (7, 6) (14, 3) (1, 5) (11) (59) (34, 52) (129) (111)
18      11/15   (76, 69, 119) (19, 50) (64, 12) (31) (21, 67, 112) (85) (39, 28) (11, 17) (135) (129)
18      11/15   (60, 29, 43) (15, 14) (1, 56) (16) (37, 39) (32, 5) (3, 11, 81) (8) (19) (51)
18      10/11   (129, 61, 70) (23, 29, 9) (20, 59) (17, 6) (11, 44) (28) (157) (5, 54) (49) (103)


An analogous phenomenon is a compound perfect squared rectangle composed 
of a perfect squared subrectangle surrounded by squares. Here is an example, first 
found by Federico (private communication). The elements of the squared sub- 
rectangle are set off by square brackets: 


(390,389) ([3·14, 3·18], 293) (292,98) [3·10, 3·4] [3·7, 3·15]
[3·9, 3·1] [3·8] (194), a 779 by 682 rectangle of order 15.




We have no statistics corresponding to those in Table I relative to compound 
squared rectangles because of the duplications that may occur among the various 
collections of gnomons of a given order that we separately constructed. We did 
observe, however, that compound perfect rectangles with p + q < 30 and (p,q) = 1
first occur with higher orders than in the case of simple perfect rectangles. 


6. Historical notes. Max Dehn [8] proved in 1903 that a rectangle can be squared 
if and only if its sides are commensurable (see [25,§3] for a modern version of 
Dehn’s proof). Yet no example of a perfect squaring was discovered until 1925 when 
Z. Moron [14] found the 32 by 33 simple perfect rectangle of order 9. In 1930, 
M. Kraitchik [13, p. 272] quoted the famous N. Lusin to the effect that there exists 
no perfect squared square. Kraitchik, too, knew only nontrivial examples of imperfect
squared rectangles. Apparently Moron’s example was published in too
obscure a journal (see also [7]). Finally, in 1939, the German geometer R. Sprague
[15] discovered a compound perfect squared square of order 55 and side 5·16·29,
after he had attempted to prove Lusin’s conjecture. Sprague built his example from 
two different 13 by 16 compound perfect squared rectangles and two squares. The 
next year Sprague [16] proved that each rectangle with commensurable sides has a 
perfect, perhaps compound, squaring, and, indeed, that each such rectangle has 
infinitely many totally distinct perfect squarings. 

Almost simultaneously with Sprague and independently the paper by Brooks, 
Smith, Stone, and Tutte [6] appeared. Bouwkamp [1,2] also found all the low order 
squarings of rectangles, but constructed no developed theory. Brooks et al did. 
They obtained Sprague’s result, developed the analogy with electrical networks 
(which shows that each squared rectangle has commensurable sides and subsquares), 
and proved much more: there exists no perfect squared rectangle of order 8 or less; 
there exist two simple perfect squared rectangles of order 9; there exists a simple 
perfect squared square of order 55 and a compound perfect squared square of order 
26 — they gave examples [6, p. 333 and p. 334]. Using the electrical network analogy, 
C.J. Bouwkamp et al [3] gave a catalogue of all simple perfect squared rectangles 
of orders less than sixteen. In 1948 T. H. Willcocks, then a clerk for the Bank of 
England in Bristol, discovered a compound perfect squared square of order 24 and 
side 175 [18, 23,24]. He holds the record still. 

Duijvestijn [9] extended Bouwkamp’s work. Federico [10,11], Brooks [5], 
Bouwkamp [4], and John C. Wilson [21,22] found interesting examples of squared 
squares and rectangles, mostly by computer. I. M. Yaglom [25] finally published the 
first book on squared rectangles. It contains much original material. 

The outstanding open question remains: how to find the perfect squaring of least 
order of a given rectangle with commensurable sides. Perhaps it will prove easier to 
estimate closely this minimal order. The perfect squared square of least order will 
soon be found if computational difficulties are overcome or if much faster computers 
are built that will compute much more per dollar. It may well turn out to be Willcocks’s
gem: (64,56,55) (16,39) (38,18) (33,31) (3,4,9) (20,1) (5) (14) (30,81) (2,29)
(35) (8,51) (43).


References 


1. C. J. Bouwkamp, On the dissection of rectangles into squares, I-III, Nederl. Akad. Wetensch.
Proc., 49 (1946) 1176-1188; 50 (1947) 58-71 and 72-78.

2. ———, On the construction of simple perfect squared rectangles, Nederl. Akad. Wetensch.
Proc., 50 (1947) 1296-1299.

3. ———, A. J. W. Duijvestijn and P. Medema, Catalogue of simple squared rectangles of orders
nine through fifteen, Department of Math. and Mech., Technische Hogeschool, Eindhoven 1960.

4. ———, On some special squared rectangles, J. Combinatorial Theory, 10 (1971) 206-211.

5. R. L. Brooks, A procedure for dissecting a rectangle into squares, and an example for the
rectangle whose sides are in the ratio 2:1, J. Combinatorial Theory, 8 (1970) 232-243.

6. R. L. Brooks, C. A. B. Smith, A. H. Stone, and W. T. Tutte, The dissection of rectangles into
squares, Duke Math. J., 7 (1940) 312-340.

7. S. Chowla, The division of a rectangle into unequal squares, Math. Student, 7 (1939) 69-70.

8. Max Dehn, Über die Zerlegung von Rechtecken in Rechtecke, Math. Ann., 57 (1903) 314-332.

9. A. J. W. Duijvestijn, Electronic computation of squared rectangles, Thesis, Technische
Wetenschap aan de Tech. Hogeschool te Eindhoven, 1962.

10. P. J. Federico, Note on some low-order perfect squared squares, Canad. J. Math., 15 (1963)
350-362.

11. ———, Some simple perfect 2 x 1 rectangles, J. Combinatorial Theory, 8 (1970) 244-246.

12. N. D. Kazarinoff and Roger Weitzenkamp, On existence of compound perfect squared
squares of small order, J. Combinatorial Theory, B 14 (1973) 163-179.

13. Maurice Kraitchik, La mathématique des jeux ou Récréations mathématiques, Stevens
Frères, Bruxelles, 1930.

14. Z. Moroń, O rozkładach prostokątów na kwadraty, Przegląd Matem.-Fizyczny, 3
(1925) 152-153.

15. R. Sprague, Beispiel einer Zerlegung des Quadrats in lauter verschiedene Quadrate, Math.
Z., 45 (1939) 607-608.

16. ———, Über die Zerlegung von Rechtecken in lauter verschiedene Quadrate, J. Reine Angew.
Math., 182 (1940) 60-64.

17. W. T. Tutte, The dissection of equilateral triangles into equilateral triangles, Proc. Cambridge
Philos. Soc., 44 (1948) 463-482.

18. ———, Squaring the square, Canad. J. Math., 2 (1950) 197-209.

19. ———, Squaring the square, “Second Scientific American Book of Mathematical Puzzles
and Diversions,” by Martin Gardner, Simon and Schuster, New York, 1961, 186-209. Reprinted
from Scientific American, November, 1958, 136-142.

20. ———, A theory of 3-connected graphs, Indag. Math., 23 (1961) 441-455.

21. ———, The quest of the perfect square, this MONTHLY, No. 2, Part II, 72 (1965) 29-35.

22. ———, Squared rectangles, Proc. I. B. M. Scientific Computing Symposium on Combinatorial
Problems (March, 1964) 3-9, I. B. M. Data Processing Div., White Plains, N. Y. 1966.

23. T. H. Willcocks, Problem 7795 and Solution, Fairy Chess Review, 7 (1948) 97, 106.

24. ———, A note on some perfect squared squares, Canad. J. Math., 3 (1951) 304-308.

25. I. M. Yaglom, How to cut up a square? (Russian) Nauka, Moskva 1968.


WHAT EVERY COLLEGE PRESIDENT SHOULD KNOW ABOUT 
MATHEMATICS 


JOHN G. KEMENY, Dartmouth College 
(Invited Address, MAA Meetings, August, 1972) 


Let me start with a very brief remark on the nature of the college presidency. 
The best characterization occurred to me somewhat accidentally when I was speaking 
to our California alumni during the primary races last spring. I found some remarkable
similarities between the activities the political candidates were engaging in and
what I was doing. Therefore, I should like to characterize the college presidency
as the unique job in which you first get elected to office and then spend the rest of 
your time running for office. 

It is a very peculiar multi-faceted job and it is not clear what kind of training 
is really good for a college president. But having a mathematician in such an office 
is a sufficiently rare event in the history of higher education that it might be worth 
considering whether there are ways in which being a mathematician is helpful to a 
college president. Obviously, there are a great many activities college presidents 
engage in where mathematics is totally irrelevant, but the opposite question is an 
interesting one. Therefore I will interpret the topic proposed for me by Professor 
J. L. Snell to mean ‘‘Are there certain things about mathematics that all college 
presidents should know?’’ 

I shall briefly mention some trivia and then go on to a few interesting examples 
from my own experience. 

Most people would say that it helps a college president to be able to read a fi- 
nancial statement. That is actually not as trivial as it sounds, because the purpose 
of most college financial statements is to meet a legal requirement, but to make sure 
that no one who reads the statement can find out what its contents mean. Secondly, 
it is good to be able to check the arithmetic in financial statements. I managed to 
find two errors in my first year as president, and I notice that this has had a salutary
effect on the care with which these statements are prepared. But, more important 
is the fact that mathematicians are usually very good at explaining complicated 
mathematical things in fairly simple language to a non-mathematical audience. 


John Kemeny was born in Budapest and did his undergraduate and graduate work at Princeton, 
where he received his Ph.D. He was in the US Army at the Los Alamos Project, and he served a 
year as Albert Einstein’s research assistant before becoming a Fine Instructor, and later Assistant
Professor of Philosophy at Princeton. He has served Dartmouth College as Professor, Chairman of 
the Mathematics Department, Albert Bradley Professor, and President. He is a fellow of the American 
Academy of Arts and Sciences and has served the MAA on several committees, on the Board of 
Governors, and as Chairman of the New England Section. 

He is the co-author or author of 13 books including the well-known Introduction to Finite
Mathematics, Finite Markov Chains, Denumerable Markov Chains, and more recently Man and the
Computer: A New Symbiosis. Editor.




890 J. G. KEMENY [October 


That turns out to be a very useful asset to a college president, because you can take 
a highly complex statement and translate it for the faculty, students, and alumni. 

I have a very strong feeling that decision-makers should not get their facts at 
second-hand. I practice that particular preaching, for example, by continuing to 
teach. I don’t like to get feedback on what students are thinking by having a student 
tell his professor, who tells a dean, who tells a vice president, who tells the president. 
A great deal is lost in translation along the way. And I find that being able to remain 
active in the classroom is by far the best way of having your finger on the pulse 
of the campus. Similarly, there are many decision-making problems, and I will 
show you some examples, where being able to deal with the facts at first-hand gives 
you a much better feeling as to what the problem is and what the possible solutions 
are. 

The most serious contribution that the mathematician-president can make is 
the fact that he knows something about model-building, and I would like to con- 
sider in detail some of the models that I have had to deal with in my first two and 
one-half years as a college president. 

At the beginning of my term of office I realized that Dartmouth was facing a 
serious problem as far as the number of people on tenure was concerned. The size 
of the Dartmouth faculty was significantly increased in the 20’s and the College 
was very good to its faculty during the depression and the war years, as a result of 
which in the early 50’s the same faculty was still around and John Dickey, the pre- 
vious President of Dartmouth, faced a horrendous re-building problem. Within a 
decade 80% of the permanent faculty would retire. He did a great job in that rebuild- 
ing, but it is almost impossible under those circumstances to avoid repeating past 
history. Once again you build up a strong young faculty, have them around for 30 
more years, and face an impossible tenure situation 30 years later. Although the 
problem wasn’t quite that extreme, it had to be tackled right away. 

Let us consider a very simple model of the problem of ranks and tenure. A typical 
pattern of progression through the academic ranks is shown in Table 1. There is 


TABLE 1 


Instructor                          2 Years
Asst. Prof. First Appointment       3 Years
Asst. Prof. Second Appointment      3 Years
Associate Prof.                     6 Years
Professor                          24 Years


a rather natural Δt for the model, namely 3 years. Furthermore, instructors are
so ill-defined and mean so many different things that I am going to forget them in 
the simplified model. For the question of tenure, it’s the non-tenure ranks versus 
the tenure ranks that are significant, so I'll lump the two tenure ranks together 
into one of 30 years average duration. The result is shown in Table 2. The actual 


1973] WHAT EVERY COLLEGE PRESIDENT SHOULD KNOW ABOUT MATH. 891 


model we worked with was more sophisticated, but the essence of what we did can 
be brought out in this simplified model. 


TABLE 2 


API (Asst. Prof. First Appointment) 3 Years 
APII (Asst. Prof. Second Appointment) 3 Years 
Tenure 30 Years 


Let us consider this model in some detail. What should the strategy for promotion
and tenure be? You have three boxes into which you put not balls but faculty
members. (I’m sorry, I’ve worked on probability problems too long.) The first
one is the first appointment at assistant professor, the second one is the second
appointment at assistant professor, and the third is the tenure box, with, say, x, y, and z
people respectively; and you have transition probabilities. With our three-year time cycle, and
three-year appointments, everybody moves out of the first box in one time period.
Say a fraction $p_1$ is reappointed and a fraction $1 - p_1$ leaves the institution, voluntarily
or otherwise. In the second box a fraction $p_2$ is promoted, this time to tenure, and
a fraction $1 - p_2$ is going to leave the institution. From tenure (since I have lumped
together all the tenure ranks there is no promotion) some fraction $p_3$ will
leave. This includes retirements, deaths, and going to another institution because
of a better offer. The transition diagram is shown in Table 3.

One can control $p_1$ and $p_2$ by institutional policy. One really has little control
over $p_3$. The problem is to relate the choice of parameters to the two most interesting
quantities: the fraction of faculty on tenure and the probability that a new assistant 
professor will reach tenure. 

The equilibrium conditions are easily found. For simplicity I’ll assume that the
total number of faculty members is fixed during this period, so the number coming
in will have to equal the number going out. Let n new faculty members come in
during 3 years, all as API. Since everybody leaves the first box and all the new ones
come in at the beginning level, n must equal x. Next, from the first box $p_1 x$ goes to
the second box, and everybody from the second box leaves at the end of three years,
so $p_1 x$ must equal y. And since $p_2 y$ is the number entering the tenure ranks and $p_3 z$
is the number leaving, these two numbers must be equal. There is an additional
condition that the number of new people coming in must exactly equal the numbers 
that go out, but that turns out to be a consequence of the other three equations 
and is therefore redundant. 
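
As a check on the redundancy claim, the total outflow per three-year period can be written out using only the three equilibrium equations ($n = x$, $p_1 x = y$, $p_2 y = p_3 z$). This is an editorial sketch in the notation of the text, not part of the original address.

```latex
% Total number leaving in one period: from API, from APII, and from tenure.
\begin{align*}
(1 - p_1)x + (1 - p_2)y + p_3 z
  &= x - p_1 x + y - p_2 y + p_2 y   && \text{since } p_3 z = p_2 y \\
  &= x - y + y                       && \text{since } p_1 x = y \\
  &= x = n .
\end{align*}
```

So the number leaving equals the number entering automatically, which is why the fourth condition adds nothing new.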

What is the probability that someone coming in as a new assistant professor
is going to reach tenure? With probability $p_1$ he will get promoted once, and then with
probability $p_2$ he will get promoted a second time; therefore, on the average, $p_1 p_2$ is the fraction
that is going to reach tenure from an initial appointment. Clearly, if you are going
to care about your faculty members you’d like that quantity to be high. The tenure 
ratio is simply the number of tenure people divided by the total number, so it is 


892 J. G. KEMENY [October 


z/(x +y +z). These results are summarized in Table 3. 


TABLE 3 PROMOTION STRATEGY 


Box:        AP I (x)          AP II (y)          Tenure (z)
Entering:   $n$               $p_1 x$            $p_2 y$
Leaving:    $(1 - p_1) x$     $(1 - p_2) y$      $p_3 z$

Equilibrium:  $n = x$,  $p_1 x = y$,  $p_2 y = p_3 z$
Prob. of Tenure = $p_1 p_2$
Tenure Ratio = $z/(x + y + z)$


Now the question is what can you do with the amount of freedom you have 
and what are the implications of various policy decisions? From the equilibrium 
equations we can express y and z in terms of x. Substituting in the formula for the 
tenure ratio and simplifying we find that 


$$\text{Tenure ratio} = \frac{p_1 p_2}{p_3 + p_1 p_3 + p_1 p_2}.$$


Note that the probability of reaching tenure, $p_1 p_2$, occurs in two places in the
formula. 
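
The substitution that leads to this formula can be spelled out as follows. The derivation is a sketch that uses only the equilibrium equations of Table 3 ($p_1 x = y$ and $p_2 y = p_3 z$) and assumes $p_3 > 0$.

```latex
\begin{align*}
y &= p_1 x, \qquad z = \frac{p_2 y}{p_3} = \frac{p_1 p_2}{p_3}\,x, \\[4pt]
\frac{z}{x + y + z}
  &= \frac{\dfrac{p_1 p_2}{p_3}\,x}{x + p_1 x + \dfrac{p_1 p_2}{p_3}\,x}
   = \frac{p_1 p_2}{p_3 + p_1 p_3 + p_1 p_2}.
\end{align*}
```

The number of new appointments $x$ cancels, so the tenure ratio depends only on the three transition fractions.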

Our aim is to make the quantity $p_1 p_2$ as high as possible, and to have the tenure
ratio as low as possible. Suppose you have set the value of $p_1 p_2$ as high as you can;
what are the remaining quantities? They are all in the denominator, and all with
plus signs; so you would like to make them large in order that the tenure ratio be
low. Now we’d love to fiddle around with $p_3$ but that’s a dangerous business. Clearly
the number of people who retire through mandatory retirement is fixed. In equilibrium,
if we are talking about 30 years’ tenure, in a three-year period roughly one-tenth
will retire. In addition to that some fraction will leave voluntarily, say another
5%, so $p_3$ might be something like .15. But beyond that, all you can do is make
life miserable for senior faculty members so that they will decide to leave the
institution. Let us, therefore, assume that you have no control over $p_3$. That still
leaves you $p_1$, and the first surprising conclusion is that $p_1$ should be as high as
possible.

This calculation quite clearly shows there is every advantage to making $p_1$ as
close to 1 as possible (for simplicity I’ll use the value 1, though in practice that
isn’t likely). That is, a first term assistant professor should have a very easy time
getting a reappointment. Obviously this is a very popular decision in a difficult
job market. Setting $p_1 = 1$ will both help you in increasing the probability of reaching
tenure and help to decrease the tenure ratio. So here is a nice counter-intuitive
result which actually led to a change of policy at Dartmouth College.

With $p_1 = 1$, the tenure ratio comes out to be $p_2/(2p_3 + p_2)$ and since you have
very little freedom about $p_3$, the ratio is determined by $p_2$. Substituting a rough
value .15 for $p_3$, we obtain a tenure ratio of $p_2/(p_2 + .3)$. The ratio is monotone
increasing in $p_2$. And since $p_1 = 1$, the probability of reaching tenure is
precisely $p_2$. Now you have a conflict. On the one hand, to protect the future of
the institution, you don’t want the ratio to rise too high; on the other hand, you
would like to make $p_2$ as high as possible, to make the institution attractive to new
faculty members. I’ll give you two typical values. If the tenure ratio is 50% the 
probability of reaching tenure is only 30% according to this model. On the other 
hand, if the tenure ratio goes up to 60%, then the probability of reaching tenure 
in equilibrium is 45%, which is reasonable for a new assistant professor at a good 
institution. While we did not have an equilibrium situation at Dartmouth and there- 
fore our value is not as favorable, a calculation like this persuaded us to allow the 
tenure ratio to creep up over a decade to 60% so that we can give a decent chance 
for new assistant professors to stay here. 
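
The two quoted figures can be reproduced with a few lines of code. The following is a minimal sketch of the simplified three-box model only (not the more sophisticated model actually used at Dartmouth); the function names are illustrative, and $p_3 = .15$ is the rough value suggested in the text.

```python
def tenure_ratio(p1, p2, p3):
    """Equilibrium fraction of faculty on tenure: p1*p2 / (p3 + p1*p3 + p1*p2)."""
    return (p1 * p2) / (p3 + p1 * p3 + p1 * p2)


def tenure_probability(p1, p2):
    """Chance that a new assistant professor eventually reaches tenure."""
    return p1 * p2


def p2_for_target_ratio(target, p3, p1=1.0):
    """Solve tenure_ratio(p1, p2, p3) == target for p2."""
    # target * (p3 + p1*p3 + p1*p2) = p1*p2
    #   =>  p2 = target * p3 * (1 + p1) / (p1 * (1 - target))
    return target * p3 * (1 + p1) / (p1 * (1 - target))


if __name__ == "__main__":
    p3 = 0.15  # rough three-year departure rate from the tenure ranks
    for target in (0.50, 0.60):
        p2 = p2_for_target_ratio(target, p3)
        print(f"tenure ratio {target:.0%}: probability of tenure = {p2:.0%}")

    # Same probability of tenure (0.30), but a harder first reappointment:
    # the tenure ratio comes out higher, not lower.
    print(tenure_ratio(0.5, 0.6, p3) > tenure_ratio(1.0, 0.3, p3))
```

The last line illustrates the counter-intuitive point made above: for a fixed probability of reaching tenure, lowering $p_1$ raises the tenure ratio.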

That’s my first example of a model. The model is not terribly complicated; it 
does not use advanced mathematics; and yet it uses the kind of argument that 
someone not trained in mathematics is not likely to come up with or even be able 
to follow. Therefore I think it is an example that is legitimate under the topic ‘‘What 
mathematics should all college presidents know?’’ 

Let me go to a second model, which I’ll do more briefly. We are in a period
of expansion in the Dartmouth faculty since we are about to go on year-round
operation. That raised the question of how new faculty should be distributed amongst
the existing departments. That’s an old debate on almost every academic campus
and any time you try to come up with an agreement as to what’s an equitable formula,
you have a losing fight. I know because I tried to get agreement and I lost
the fight! 

But it is still an interesting problem and therefore I asked myself a slightly different
question: “Can one give a rational reconstruction for the way we are now
assigning numbers of faculty members to departments?” One knows that the process
is not totally rational. A great many conditions influence it: the persuasiveness of a
departmental chairman, the prejudice of a dean, and accidental conditions all have an
effect on the size of the departments. Nevertheless, can one look at the facts, identify
those factors that could influence such a decision, and come up with a rational 
reconstruction as to how the assignments were made? 

We tried coming up with a set of relevant factors. At first we identified too many 
factors, and if you have too many possible factors you can explain absolutely every- 
thing, including things that are just plain wrong in the system — you pick up a great 
many accidental correlations. In a way the only difficulty was to cut down the number 
of factors to a reasonable list which still gives you a good explanation (but not too
good). We came up with seven factors (see Table 4) in terms of which we got an
amazingly good linear fit on the distribution of faculty members. 


TABLE 4 — FACULTY LOAD FACTORS 


1. Students in regular sections
2. Students in lecture sections
3. Students in labs
4. “Must be small” sections
5. Majors
6. Graduate students
7. Constant


The method of “best linear fit” assigns coefficients $C_1, C_2, \ldots, C_7$ to the seven
factors so that the linear combination “fits” the actual number of faculty members
as closely as any linear formula can. The coefficients are the weights assigned to the
various factors, and their interpretation is quite interesting. I shall discuss several 
coefficients, taking the liberty of rounding the answers. 

The coefficient $C_1$ may be interpreted as assigning one full-time faculty member
for each 150 students in “normal” sections. Thus a department “earns” a faculty
member for teaching six sections of 25 students each or for five sections of 30 students.
The value of $C_2$ is smaller; there must be 250 students in lecture sections before
the department is given a full faculty member. Additional faculty is assigned for
lab sections, for supervision of majors and of graduate students ($C_3$, $C_5$ and $C_6$).
Coefficients $C_4$ and $C_7$ deserve special mention.

You clearly don’t want to reward a department simply for giving many small
sections; on the other hand, there are departments that are forced to give small
sections. For example, the Faculty at Dartmouth has voted that a Freshman Seminar 
cannot have more than 15 students in it, and clearly this must be taken into account 
in the loads of departments. A department is assigned one faculty member for every 
12 sections that must be small. To interpret this we must recall that our formula 
is additive. For example, if a department has 150 students in 12 sections that must be 
small, it will be given one faculty member for the 150 students taught, and an addi- 
tional faculty member for the 12 special sections. This seems to be fairly equitable 
recognition of the extra work involved in teaching many small sections. 

Let us now look at the constant term $C_7 = 1$. That is, each department is given
one faculty member irrespective of load. I think the interpretation is simply the over- 
head of having a department. So, if this reconstruction is at all reasonable, every 
time you create a new department, you commit something like one full-time faculty 
member just because it is a separate department, quite independently from any 
teaching load the department may carry. 
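
The additive character of the formula can be illustrated with the rounded weights quoted above. This is only a sketch: the function name is hypothetical, and because the text gives no values for the lab, major and graduate-student coefficients, they are lumped into one optional placeholder argument.

```python
def faculty_earned(regular_students, lecture_students, small_sections,
                   labs_majors_grads=0.0):
    """Rounded version of the linear fit described in the text.

    labs_majors_grads stands in for the C_3, C_5 and C_6 terms,
    whose values are not given in the text (assumption: 0 by default).
    """
    from_regular = regular_students / 150   # C_1: one per 150 students in regular sections
    from_lecture = lecture_students / 250   # C_2: one per 250 students in lecture sections
    from_small = small_sections / 12        # C_4: one per 12 "must be small" sections
    constant = 1.0                          # C_7: one per department, irrespective of load
    return from_regular + from_lecture + from_small + labs_majors_grads + constant


if __name__ == "__main__":
    # The example from the text: 150 students taught in 12 sections that must be small.
    # One member for the students, one for the special sections, one for the constant term.
    print(faculty_earned(regular_students=150, lecture_students=0, small_sections=12))
```

Under these rounded weights a department with no load at all would still be credited with one faculty member, which is the overhead interpretation of the constant term.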

This reconstruction (and let me remind you again that nobody actually makes 
the decisions this way) is a rational reconstruction of how we could explain the size 
of different departments. You could pick up a certain number of faculty members
because of a certain number of students, plus some additional ones because of
students electing a certain special section, plus additional ones because of majors 
or graduate students, plus one for each department, etc. Although this reconstruction 
came out automatically from a computer run, the weights correspond reasonably 
well with our intuition as to how we might have done this assignment. Therefore 
there is something to it. And we found one other very convincing piece of evidence. 
The formula doesn’t of course fit things perfectly, and there were three or four 
notable examples of departments having too many faculty members, or too few, 
and in every case where that happened, they were departments where the deans 
knew that the department was either overstaffed or understaffed. 

To me it is quite surprising that you can do that well in a rational reconstruction,
and we are planning to use this to monitor the growth of the Dartmouth Faculty 
in the next few years, as enrollment increases, and as there is the possibility of a shift 
in enrollments because of the presence of a significant number of women. (Though 
I suspect the shift will be less than most people predict—I think students take 
courses because of the reputation of departments on campus and not because they 
are men or women.) But whatever may happen, this model will give us a check as 
to whether we are allocating enough faculty members to departments that are being 
heavily hit by new enrollments. 

So this is a second mathematical model, of a somewhat different type, that has 
a great deal to do with long-range planning at an institution. As a third example 
I picked a combinatorial problem arising from our decision to go on year-round 
operation. Why are we going on year-round operation, aside from the fact that, 
as you can see, Hanover is a very beautiful place in the summer? The goal is to 
accommodate, in addition to 3000 male undergraduate students, about 1000 women 
students and to do that without building any additional buildings —for the simple 
reason that we can’t afford it. The idea was to spread out the academic year from 
three terms to four terms and accommodate a larger number of students in the same 
space (but with a larger faculty). 

The problem was how to design a suitable calendar for student attendance. 
There was universal agreement on the fact that they should come for a normal 
freshman year of fall, winter and spring (see top of Table 5). And then the idea 
was that with twelve terms required for graduation, they would have nine more terms 
to choose out of twelve. There are 220 possible calendars, and that seems like an 
enormous amount of choice. Surprisingly, that answer turned out to be wrong. 
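
The figure 220 is the number of ways of choosing 9 terms out of 12. The sketch below enumerates those calendars and applies a filter for reasonableness. The particular conditions (`max_off`, `max_on`) are hypothetical stand-ins, since the text does not say which conditions were actually used, so the filtered count is illustrative only.

```python
from itertools import combinations
from math import comb

TERMS_AFTER_FRESHMAN_YEAR = 12   # four terms a year for three more years
TERMS_TO_ATTEND = 9              # 12 required for graduation, 3 done as a freshman


def all_calendars():
    """Yield each calendar as a tuple of 12 booleans (True = on campus)."""
    for chosen in combinations(range(TERMS_AFTER_FRESHMAN_YEAR), TERMS_TO_ATTEND):
        on = set(chosen)
        yield tuple(t in on for t in range(TERMS_AFTER_FRESHMAN_YEAR))


def longest_run(calendar, value):
    """Length of the longest run of consecutive terms equal to `value`."""
    best = run = 0
    for term in calendar:
        run = run + 1 if term == value else 0
        best = max(best, run)
    return best


def is_reasonable(calendar, max_off=2, max_on=4):
    """Hypothetical filter: no long absences, no very long stretches on campus."""
    return (longest_run(calendar, False) <= max_off
            and longest_run(calendar, True) <= max_on)


calendars = list(all_calendars())
print(len(calendars), comb(12, 9))               # both counts are 220
print(sum(is_reasonable(c) for c in calendars))  # plans passing the hypothetical filter
```

Even a mild filter of this kind rejects plans such as three terms off followed by nine in a row; stricter or different conditions would change the count.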

In the figure 220 we have counted all possible combinations of how students
could elect a schedule without taking into account whether they are reasonable. 
For example, we have counted the plan where after the freshman year the student 
takes off three terms and then goes nine terms in a row. Presumably no student 
will elect that plan. Therefore, some computer work was done (you can do it by 
hand, it just takes a long time), to try to put some conditions of reasonableness 
on the solutions, and to try to find out how many plans look at all attractive. The
rather surprising result was that starting with 220 possibilities, the number of reasonable
plans was not high enough to make this plan acceptable. Of the roughly one
dozen attractive plans the majority has the student on campus every winter term,
which would not allow us to increase the enrollment.


TABLE 5 — DARTMOUTH PLAN

[Two calendar charts with rows Frosh, Soph, Junior, Senior: top, the standard freshman year of fall, winter and spring; bottom, the single calendar plan proposed by the faculty-student committee.]

The next proposed solution, and still my favorite, is not the one we have adopted. 
An excellent faculty-student committee came up with a single calendar plan and 
tried to persuade us that every student ought to go on this plan. (See the bottom 
of Table 5.) It’s a marvelous scheme. It meets the boundary condition of a standard 
freshman year and the boundary condition of all students graduating in the spring 
of the senior year; it gives two six-month vacations, which is a very attractive option 
—students can get interesting jobs in that time period. It reduces the total number 
of terms from 12 to 11. And it has the marvelous feature that in fall, winter and spring 
only three of the four classes are present; therefore you can increase the total student 
body by one-third and still use the same space. 

It was an extremely tempting plan, but the faculty did not buy it, for quite
good reasons. The thought of going to something strange and having it totally required
bothered the faculty, not to mention such shortcomings as the fact that there would
be no sophomore football players, and no junior hockey and baseball players on 
campus. That’s where reality impinges on combinatorics! (To be fair, the committee 
proposed a solution to this problem, but it was not a popular one.) 




Nevertheless, the much greater choice under an 11-term plan led the faculty 
to a reduction of the graduation requirement to 33 courses. Since 32 courses is 
a typical requirement at many comparable schools, this was entirely reasonable. 
But the faculty opted for a maximum amount of freedom of choice for students. 
While the very attractive plan in Table 5 is a legal option, it is only one of many. 
Within certain limits each student would be allowed to design his own pattern for 
going through college. Every student will be required to come at least one summer, 
and we will reserve the right that if everyone wants to come in the same term, some 
students must take a second choice. The price of greater choice is that instead of 
increasing the student body by 33% we expect to increase it by about 25%. 

I have shown you two numerical models and one combinatorial model. I would 
now like to show you a dynamic model, dealing with the use of endowment income. 
I have never seen this result in print, although once you have formulated the problem, 
the solution is quite trivial. The fact that such a basic result is not in general use 
may have something to do with the fact that the presence of a mathematician in 
the president’s office is a rare event in American higher education.

Table 6 shows the three basic variables X, E and U, and six parameters. It
is the interrelation of these quantities that I want to explore. Let us consider the
six parameters. We assume an equilibrium situation in which these parameters
remain constant. $R_1$ is the rate at which expenses grow annually, the fundamental
quantity in budget control. $R_2$ is the rate at which you want your endowment to
grow. $R_3$ is a crucial number, the total return you get on your endowment. In
modern investment policy we don’t care what comes in as dividends and what comes
in as capital growth; all you care about is the long run total return. With good
investment management you should try to achieve something like 9% per year.
$R_4$ is the percentage of the endowment that you use this year. $R_5$ is new endowment
as a percentage of old endowment. Each year you get some new gifts. Although
the amounts vary from year to year, the Dartmouth experience is that they represent
a fairly steady percentage of your actual endowment, something like 2%. And finally
$R_6$ is the percentage of the expenses covered by your endowment.


TABLE 6 — ENDOWMENT USE


X     = Expenses
E     = Endowment
U     = Endowment utilized
$R_1$ = Rate of growth of expenses
$R_2$ = Rate of growth of endowment
$R_3$ = Rate of total return
$R_4$ = % of endowment used
$R_5$ = New endowment as % of old
$R_6$ = % of expenses covered by endow.


Table 7 shows a simple model containing the interrelations of the variables.
Equations (1)-(4) are, in effect, the definitions of $R_1$, $R_2$, $R_4$ and $R_6$, respectively.
Equation (5) is a balance sheet for endowment funds. The endowment next year
will be the endowment this year plus the total return you get on it, to which you
add new endowment that comes in in the form of gifts, and from which you subtract
that which you have utilized. It is an absolutely straightforward equation. What
can one deduce from this simple model?


TABLE 7 


(1) $X_{n+1} = X_n (1 + R_1)$

(2) $E_{n+1} = E_n (1 + R_2)$

(3) $U_n = E_n R_4$

(4) $U_n = X_n R_6$

(5) $E_{n+1} = E_n (1 + R_3) + E_n R_5 - U_n$

(6) $X_n = X_0 (1 + R_1)^n$

(7) $E_n = E_0 (1 + R_2)^n$

(8) $E_0 R_4 (1 + R_2)^n = X_0 R_6 (1 + R_1)^n$

(9) $E_n (1 + R_2) = E_n (1 + R_3) + E_n R_5 - E_n R_4$


First of all, equations (1) and (2) lead to equations (6) and (7). We substitute 
these values into equations (3) and (4), setting the two values of U, equal to each 
other, obtaining equation (8). If this is to hold for large n, we must have $R_1 = R_2$.
That says that in the long run, if you want to get anything like an equilibrium situa- 
tion, your endowment must continue to grow at the same rate that your expenses 
do. Obviously, for a few years you can violate this rule, but there is no escape in 
the long run. 
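
The step from equation (8) to the equality of the two growth rates can be made explicit. The following is an editorial sketch in the notation of Table 7, assuming $E_0$, $X_0$, $R_4$ and $R_6$ are positive constants.

```latex
\begin{align*}
E_0 R_4 (1 + R_2)^n = X_0 R_6 (1 + R_1)^n
\quad\Longrightarrow\quad
\left(\frac{1 + R_2}{1 + R_1}\right)^{n} = \frac{X_0 R_6}{E_0 R_4}.
\end{align*}
% The right-hand side does not depend on n, so the left-hand side cannot either.
% A power r^n with r > 0 is constant in n only when r = 1, hence R_1 = R_2.
```

If the two rates differed, the left-hand side would grow or shrink geometrically with n, so the fraction of expenses covered by endowment could not stay fixed.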

A more interesting result is obtained from equation (5). We may use equations
(2) and (3) to express all quantities in terms of $E_n$, as shown in (9). From this we
deduce the very simple equation that $R_2 + R_4$ must, for any equilibrium solution,
equal $R_3 + R_5$. Here are the two major findings:


(10) $R_1 = R_2$
(11) $R_1 + R_4 = R_3 + R_5$.


In the second equation we have replaced $R_2$ by $R_1$. Incidentally, the results are
independent of $R_6$, the percent of expenses covered by endowment. (This differs
greatly from institution to institution; at Dartmouth it is about 1/3, at Harvard
it seems to be 110%.)

These equations were derived from very simple-minded mathematics, but are
most useful for long-range planning. The right-hand side of (11) contains your
assets. $R_3$ is the rate of total return, say 9%, and $R_5$ is the percentage of new gifts,
about 2%, adding up to about 11%. So you have to take a guess, if you are a trustee,
as to what the sum of these two numbers will be and that tells you what you can
do. The left-hand side of (11) says that you must make an allocation of this available
growth between $R_1$, the growth of endowment, and $R_4$, the amount used currently.
So here you have a conflict. You have a fixed total available to you and the more
you use currently the less the endowment is going to grow, and the more it grows
the less you can use currently. Put that way it is obvious, but that it is a simple
additive relationship frankly surprises me.

The most fundamental decision for the institution is how to apply equation (11).
At the present time Dartmouth estimates $R_3 + R_5$ = 10.5%, and we are currently
using $R_4$ = 4.5%. This allows $R_1 = R_2$ = 6% for growth of expenses (and for the
growth of the endowment). This is the kind of analysis that all boards of trustees 
ought to engage in, but to my knowledge they do not. 
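
Equation (11) and the balance equation (5) are easy to check numerically. The sketch below is illustrative only: the function names are hypothetical, and while the totals $R_3 + R_5$ = 10.5% and $R_4$ = 4.5% come from the text, the split into 8.5% return and 2% gifts is an assumption made for the example.

```python
def sustainable_growth(total_return, new_gifts, spending_rate):
    """Equation (11) solved for R_1 = R_2: growth = R_3 + R_5 - R_4."""
    return total_return + new_gifts - spending_rate


def simulate_endowment(e0, total_return, new_gifts, spending_rate, years):
    """Iterate equation (5) with U_n = R_4 * E_n; return the endowment year by year."""
    values = [e0]
    for _ in range(years):
        e = values[-1]
        used = spending_rate * e                                      # equation (3)
        values.append(e * (1 + total_return) + e * new_gifts - used)  # equation (5)
    return values


if __name__ == "__main__":
    # Assumed split of the quoted 10.5%: 8.5% total return plus 2% new gifts.
    growth = sustainable_growth(0.085, 0.02, 0.045)
    print(f"sustainable growth of expenses and endowment: {growth:.1%}")

    path = simulate_endowment(100.0, 0.085, 0.02, 0.045, years=10)
    print(f"simulated annual endowment growth: {path[1] / path[0] - 1:.1%}")
```

Raising the spending rate by one percentage point lowers the sustainable growth rate by exactly one point, which is the additive trade-off described above.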


I would like to make a few remarks about the use of computers. Some of you 
may be surprised that I stayed away from computers this long in my talk. They 
are certainly not irrelevant to the job of the president. I personally, as many of you
know, feel that the computer, particularly a good time-sharing system with its ease 
of access and very great ease of forming models, is an absolutely indispensable tool 
for education and I am beginning to see that it is also indispensable for college 
management. 


Computer usage starts with absolutely trivial applications. I’ll tell you a story
of a nice opportunity to show off before the Trustees. It will be obvious to you
that what I did was trivial, but it is not obvious to non-mathematicians. We were 
trying to figure out how fast tuition should increase and one of the Trustees asked 
how fast it has been increasing in the last decade. It was a reasonable question 
but none of us knew the answer. The Treasurer fortunately knew the present tuition 
and what tuition was ten years ago. That left the simple problem of finding out the 
rate of growth over ten years. You just take the ratio and extract a 10th root. I’m 
not going to say I did that in my head! But I happen to have a computer terminal 
in my office, and needed only a two-line program to solve the problem. So we interrupted
a Trustee meeting for approximately 45 seconds so I could tell the Trustees
that tuition at Dartmouth College had increased by exactly 6.3% per year, compounded,
in ten years.
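
The calculation described (take the ratio and extract a tenth root) amounts to the following. This is a sketch in Python rather than the program actually run, and the tuition figures are hypothetical, since the talk does not give the actual amounts.

```python
def annual_growth_rate(old_value, new_value, years):
    """Compound annual growth rate: (new / old) ** (1 / years) - 1."""
    return (new_value / old_value) ** (1 / years) - 1


# Hypothetical figures chosen only for illustration: a doubling over ten years.
rate = annual_growth_rate(1000.0, 2000.0, 10)
print(f"{rate:.1%}")   # about 7.2% per year, compounded
```

Run the other way, a rate of 6.3% compounded for ten years corresponds to multiplying the starting figure by `1.063 ** 10`.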


A second example is too complicated for detailed description in this talk. It is
the question of how you design a loan plan for students, similar to that pioneered
by Yale. Most people have a feeling that the Yale Plan is not quite right, but we have
yet to demonstrate that a more attractive one exists. You run into some very complicated
mathematical problems in designing a loan plan and computer simulation
is almost indispensable. For example, everyone talks about how many years it will
take until the plan “turns around”, i.e., the repayments start exceeding the amount
of money being borrowed. But it is far from obvious that the loan program ever
turns around. On highly reasonable assumptions the amount borrowed increases
faster than repayments. Or the debt outstanding is so high that the interest on it
makes catching up impossible. It is interesting to calculate what the maximum debt
outstanding is; it is quite possible that if all universities in the country go on a reasonable
loan plan at reasonable rates, then before the end of the century all the
money in the world will be out in the form of loans to college students!

My final example is the use of computers for the dual purpose of a good manage- 
ment information system and for long-range planning models. I’ve had a great 
deal of frustration in that information is not available to decision-makers. As a 
faculty member I firmly believed that the administration was hiding facts from the 
faculty. After I got into office I found out that the facts were not available to the 
President either. The chances are that if you ask an office for factual information, 
they either can’t find the facts or you are presented with a computer printout a foot 
high. I have a general rule that anybody who produces a computer printout that 
high must be doing the wrong thing. So the goal is to make information easily 
accessible and computer-readable. 

For example, before I became President I served years on the committee on 
tenure and promotion. Each year we needed a list of associate professors who had 
been at least four years in rank, and wanted to know how long they had been in 
rank, what departments they were in, and how old they were. It seems like a modest 
enough request, but some poor secretary had to plow through a computer printout 
a foot high and spot associate professors, try to do a mental calculation as to whether 
they had been in rank four years, do another calculation on how old they were, 
and so on. Typically ten percent of the information was wrong. In one document 
the age was stated variously as 30, as 32 and as 35. I claim that a 5-year error on 
somebody who turned out to be 30 years old is more than three standard deviations! 
So our first goal is relatively modest, just to be able to lay one’s hands on such 
information. If I had a computer terminal here I could show you that a good part 
of this is now working and is very helpful. 

Dartmouth has a secret weapon in computer developments. Starting with the 
development of our original time-sharing system, since we couldn’t afford to employ 
professionals, we have employed our own undergraduate students. We learned that 
they are often better than the professionals we could have hired. This has been our 
secret in developing one of the least expensive computing systems in the United 
States and, I think, one of the best. So when we went to a management information 
system, several faculty members worried about the design and five students did all 
the initial implementation. This is why in record time we have the beginnings of a 
management information system. 

The second part of the project is going to be much harder and I don’t really 
know what we are going to do. I feel a strong need for a model for long-range plan- 
ning. I have looked at models at several other institutions and I’m sorry to say 
that I don’t see their usefulness. They may be useful for justifying a budget to a 
legislature, but they have very little to do with real expenses. 

The problem is that most models assume that expenses are linear and I haven’t 


1973] WHAT EVERY COLLEGE PRESIDENT SHOULD KNOW ABOUT MATH. 901 


yet found anything at Dartmouth that is linear. Let me give you the simplest example. 
In these models an important item is the cost of the library per student. I claim 
that the number of students has absolutely nothing to do with what the library 
costs! Dartmouth College happens to have a library with one million volumes in 
it, of which we are very proud. But I claim that if we had half as many students 
or twice as many students we would still have a collection of one million volumes. 
Perhaps the number of faculty members has somewhat more to do with costs, but 
only marginally. Basically, we have a million volumes because we’re trying to main- 
tain a first-rate library. Therefore any changes are really second order perturbations 
on that number. 

So my big dilemma is how one builds a mathematical model which shows true 
interrelationships amongst parts of the institution, and not just in terms of money. 
What is the effect of new students on number of faculty members, and in turn what 
is the effect on space needs and how many more janitors we are going to need and 
how much more lawn has to be mowed, etc.? These questions are not even vaguely 
understood in academic life and yet someone has to come up with a suitable planning 
tool. Money is hard to get and to do well academically we must tighten up on our 
planning processes. But without rational tools I think we are going to make major 
goofs. Incidentally, if any of you have any good ideas on how to build such a model, 
I’d love to hear from you. 

What conclusions can I draw from these examples? Let me start with a trite 
statement: planning the future of a modern university is a very complex enterprise, 
and the role of the president seems to be quite different from what it was 50 years 
ago. While I said at the beginning that for a great many activities being a
mathematician is not relevant, there are a surprising number of mathematical-flavored problems
that come up in the decision-making process. As I think you see from the examples, 
none of them require highly advanced mathematics —I didn’t use homotopy theory 
once in any of my examples—for that matter I didn’t even use calculus. Neverthe- 
less, they are sufficiently complicated mathematically that the intelligent layman 
is not able to deal with them. Therefore, a college president either needs to be a mathe- 
matician or he needs someone first-rate on his planning staff who has considerable 
mathematical sophistication. And even if he has a mathematical planner, he has the 
handicap that he gets his information at second-hand. 

When I’ve said all that, I have the feeling that I did not tell the full story of 
the relevance of mathematics for the college presidency. The college president spends 
an enormous amount of his time, when he is not doing junk, solving problems. 
I think if there is a single important thing that all college presidents should know 
about mathematics, it is how to think like a mathematician. A mathematician, or 
someone well trained in mathematics, has a special knack at analyzing problems 
and in knowing how to attack their solutions. Therefore, as a college president 
I am most grateful for being a mathematician. 


SOME MATHEMATICAL VERSES 


There once was a Norbert named Wiener
Whose mind just couldn’t be keener
But he’d chant and recite
Verses so trite
That we wished he’d sing less and obscener.


A Valentine 

My valentine I send to thee 

With love and osculation 

Dear love let’s have no more harmonic separation

Love’s waves abound today 

And I await thy reciprocation. 


* * 


In the Greek mathematical Forum 
Young Euclid was present to bore ’em
He spent most of his time 

Drawing circles sublime 

And in crossing the pons asinorum. 


* * * 


A mathematician named Klein 
Thought the Moebius band was divine 
Said he, “If you glue 

The edges of two 

You'll get a weird bottle like mine.” 


* * * 


Automation is vexation 
Quaternions are bad 
Analysis situs is detritus 

I wonder, have I been had? 


* * * 


A quadratic function ambitious 
Said it’s not only wrong but it’s vicious 


It’s surely no sin 
To have both max and min 
To limit me so is malicious. 


* * 


Here’s a toast to L. J. Mordell 
Young in spirit, most active as well 
He'll never grow weary 

Of his love, number theory 

The results that he gets are just swell. 


* * * 


It used to be fun 

To add one and one 

But now I’m unsure

What sum to secure 

I’m told it may even be none. 


* * * 


There once was a function of x 
With deplorable notions of sex 
In a half-Baire condition 

It attempted coition 

With a function weakly convex. 


* * * 


Here’s a toast to Leo H. Moser 
A truly delightful exposer 

Of things mathematical 

With a flair for dramatical 
With him you cannot be a poser. 


* * * 


A question both real and complex 
Is whether a sphere is convex 

That problem proposer 

The famed Leo Moser 

Believes it depends on the sex. 

(S. Golomb) 


* The verses here were found in papers left by the late Leo Moser. They were communicated
to the Monthly by his brother, William Moser, of McGill University. 




REMARKS ON WOMEN IN MATHEMATICS 


ASSOCIATION FOR WOMEN IN MATHEMATICS, Philadelphia Chapter, Temple University 


As the Philadelphia Chapter of the Association for Women in Mathematics, we 
feel it is not only appropriate but necessary to respond to the article by University 
of Pennsylvania professor Murray Gerstenhaber in the June-July issue of the
MONTHLY, 1972.

The article presents stereotyped, derogatory and negative views of women in the 
mathematical, academic and professional world. Professor Gerstenhaber seems
to regard childrearing as an insurmountable obstacle to women’s intellectual and 
professional achievement. He completely ignores the fact that many women contin- 
uously pursue challenging careers and make significant contributions to their chosen 
professional fields. He would have women channeled into careers in recently mathe- 
maticized fields where their “principal expertise (would consist) in using a computer 
cleverly.” The reasoning is that they can continue such careers from their homes, 
to which they are presumably tied by maternal duties. But women are by no means 
so restricted. 

Many women continue working while their children are small.¹ Many who do
interrupt their careers to raise children want an alternative. But why relatively 
undemanding work to be done at home? To propose this as the only course open to 
them is to ignore large changes going on in society. Childcare centers are coming 
more and more to be perceived as potentially enriching to the child as well as
liberating to the parents.² More and more couples, especially in academia, find that by
sharing family responsibilities both are able to pursue their careers.³ In many quarters
the traditional, confining concept of woman’s role is being challenged. 

In light of this, we feel Professor Gerstenhaber has made the wrong prognosis, 
not only about the careers women will choose, but also about the appropriate res- 
ponse for mathematics departments to the expected influx of women. His solution 
is to “omit proofs,” that is, to lower standards.

Mathematics departments have always functioned to teach skills to students 
in other disciplines, e.g., engineering. Why is it now, with women anticipated in 
large numbers, that we must suddenly begin to omit proofs? The implication, in- 
tended by the author or not, is that women can’t do mathematics as well as men. 
Another implication scarcely avoidable when computer-oriented careers are advoca- 
ted for women is that it is a good choice for them because of its routine nature, the 
work being often that of a technologically trained secretary, and again women are put 
in the same limited slot of accessory that they’ve always been in. 

The math students of the future may be predominantly women, but we cannot 
foresee the necessity for a “freshman course ... taught like arithmetic in grade school.” 
There are contributions to be made in advanced theoretical research, and some, we 




are sure, will be made by women who have been taught something other than the 
techniques of computer programming. 

Mathematics departments may be expected to offer courses to those of both 
sexes who desire only to be trained for computer applications, but we feel that no 
self-respecting mathematics department will ever feel the need to lower its standards 
in order to educate its women students. 


NOTES 


1. For example, of a group of 1957-58 Radcliffe Ph.D.s, 91% were employed in 1964 and 79%
had never had any interruption in the full-time pursuit of their professions. The time period of the
study corresponded to the child rearing years of most women. See Patricia A. Graham, “Women in
Academe,” Science, Vol. 169, No. 3995, 1284-90.

2. It is now widely accepted that children do not suffer if their mothers work. Rather it is the 
quality of attention the child receives which influences his or her emotional development. See Alice 
Rossi, “Women in Science, Why so few?” Science, Vol. 148, No. 3674, 1196-1202.

3. “Marriages in Academia Reflect the Changing Status of Women,” New York Times, Nov. 13,
1972. 


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 
Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 
A NOTE ON CATALAN PARENTHESES 
JOHN RIORDAN, The Rockefeller University 


E. Catalan [1], in 1838, proposed and solved the problem of finding the number 
of ways of evaluating the product of n factors (in fixed order) by successive
multiplications operating always on two adjacent factors; the number in question is
$c_{n-1}$, where

$$c_n = \binom{2n}{n}(n+1)^{-1}.$$

Accounts of this work appear in [2] and [5]. For n = 4, the five ways are: 
((ab)(cd)), (((ab)c)d), ((a(bc))d), (a((bc)d)), (a(b(cd))).


The first of these differs from the others in having two multiplications of the given 
factors; the other four, following [3] and [4], may be called “nests”, and in the same
spirit the Catalan parentheses are “clutches of nests”. What is the number of Catalan
parentheses of n factors with k nests?

Write $c_n(x)$ for the enumerator of clutches by number of nests (the coefficient of
$x^k$ in $c_n(x)$, $c_{n,k}$, is the number of parentheses of n factors with k nests). Then, following
Catalan,

$$c_n(x) = c_1(x)c_{n-1}(x) + \cdots + c_i(x)c_{n-i}(x) + \cdots + c_{n-1}(x)c_1(x) \tag{1}$$

with initial values $c_1(x) = 1$, $c_2(x) = x$. If $c(x, y) = \sum_{n \ge 1} y^n c_n(x)$, it follows from (1)
and the initial conditions that

$$\begin{aligned}
c(x, y) &= y + xy^2 + y c_1(x)\bigl(c(x, y) - y c_1(x)\bigr) + y^2 c_2(x)\,c(x, y) + \cdots + y^i c_i(x)\,c(x, y) + \cdots \\
&= y + (x - 1)y^2 + c^2(x, y).
\end{aligned} \tag{2}$$


The solution of (2) satisfying the initial conditions is

$$c(x, y) = \tfrac{1}{2}\left[1 - \bigl(1 - 4y - 4(x - 1)y^2\bigr)^{1/2}\right] \tag{3}$$

or

$$c(x, y) = \bigl(y + (x - 1)y^2\bigr)\, c\bigl(y + (x - 1)y^2\bigr), \tag{3a}$$

with $2y\,c(y) = 1 - \sqrt{1 - 4y}$; $c(y)$ is the generating function of the Catalan numbers
$c_n$ defined above.
Expansion of (3a) leads at once to 


$$c_n(x) = \sum_{k=0}^{n-1} \binom{n-k}{k}\, c_{n-1-k}\,(x - 1)^k, \qquad n = 1, 2, \ldots \tag{4}$$


As a verification, note that by (4) $c_n(1) = c_{n-1}$.
Further results seem to flow more easily from (2). First, denoting partial derivatives
of c(x, y) by suffixes, it follows from (2) that

$$[1 - 2c(x, y)]\,c_x(x, y) = y^2, \qquad [1 - 2c(x, y)]\,c_y(x, y) = 1 + 2(x - 1)y. \tag{5}$$

Hence

$$[1 + 2(x - 1)y]\,c_x(x, y) = y^2 c_y(x, y) \tag{6}$$

and, with primes denoting derivatives,

$$c_n'(x) + 2(x - 1)\,c_{n-1}'(x) = (n - 1)\,c_{n-1}(x) \tag{7}$$

which implies

$$k\,c_{n,k} = 2k\,c_{n-1,k} + (n + 1 - 2k)\,c_{n-1,k-1}. \tag{8}$$


The first few values of $c_n(x)$ are: $c_1(x) = 1$, $c_2(x) = x$, $c_3(x) = 2x$, $c_4(x) = 4x + x^2$.
It is apparent that $c_{n,0} = \delta_{n,1}$ (Kronecker delta), and that the degree of $c_n(x)$ is the
integral part of n/2.

The instance k = 1 of (8) and $c_{n,0} = \delta_{n,1}$ show that $c_{2,1} = 1$ and $c_{n,1} = 2c_{n-1,1}$,
n = 3, 4, ...; hence $c_{n,1} = 2^{n-2}$, n = 2, 3, .... Similarly it is found that

$$c_{n,2} = \binom{n-2}{2}\, 2^{n-4}, \qquad n = 4, 5, 6, \ldots,$$

which leads to the guess

$$c_{n,k} = \binom{n-2}{2k-2}\, 2^{n-2k}\, c_{k-1}$$

which satisfies (8) and the boundary conditions.
Thus finally

$$c_n(x) = \sum_{k=1}^{m} \binom{n-2}{2k-2}\, 2^{n-2k}\, c_{k-1}\, x^k, \qquad m = [n/2], \quad n = 2, 3, \ldots. \tag{9}$$


An interesting aspect of (9) is the identity 


$$c_{n+2}(1) = c_{n+1} = \sum_{k \ge 0} \binom{n}{2k}\, 2^{n-2k}\, c_k, \tag{10}$$

given by Jacques Touchard [6], in 1928, and now acquiring a combinatorial meaning.
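
As a numerical sanity check (not part of the original note), the closed form (9) and Touchard's identity (10) can be compared against the recurrence (1). The sketch below assumes that (1) is applied from n = 3 on, with $c_1(x) = 1$ and $c_2(x) = x$ as initial values; all function names are illustrative.

```python
from math import comb


def catalan(n):
    """Catalan number c_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def nest_enumerators(n_max):
    """Coefficient lists of c_n(x): result[n][k] is c_{n,k}.

    Built from c_1 = 1, c_2 = x and the convolution (1) for n >= 3.
    """
    polys = {1: [1], 2: [0, 1]}
    for n in range(3, n_max + 1):
        coeffs = [0] * (n // 2 + 1)  # the degree of c_n(x) is [n/2]
        for i in range(1, n):
            for p, left in enumerate(polys[i]):
                for q, right in enumerate(polys[n - i]):
                    coeffs[p + q] += left * right
        polys[n] = coeffs
    return polys


def closed_form(n, k):
    """Coefficient of x^k in (9), for n >= 2 and 1 <= k <= n // 2."""
    return comb(n - 2, 2 * k - 2) * 2 ** (n - 2 * k) * catalan(k - 1)


def touchard_sum(n):
    """Right-hand side of (10)."""
    return sum(comb(n, 2 * k) * 2 ** (n - 2 * k) * catalan(k)
               for k in range(n // 2 + 1))


polys = nest_enumerators(12)
for n in range(2, 13):
    # (9) agrees with the recurrence, and c_n(1) is the Catalan number c_{n-1}.
    assert all(polys[n][k] == closed_form(n, k) for k in range(1, n // 2 + 1))
    assert sum(polys[n]) == catalan(n - 1)
assert all(touchard_sum(n) == catalan(n + 1) for n in range(10))
print(polys[4])  # coefficients of c_4(x) = 4x + x^2, constant term first
```

Exact integer arithmetic is used throughout, so agreement here is an equality check rather than a floating-point comparison.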


References 


1. E. Catalan, Note sur une équation aux différences finies, J. Math. Pures Appl., (1) 3 (1838) 
508-516. 

2. Louis Comtet, Analyse Combinatoire, Paris 1970; Tome Premier, p. 64 ff. 

3. G. Eldredge, Nesting habits of the laddered parenthesis, Problem E 1903, this MONTHLY, 73 
(1966) 666; M. Goldberg, incomplete solution, ibid. 77 (1970) 525-526; E. F. Schmeichel, comment, 
ibid. 78 (1971) 298.

4. Richard K. Guy and J. L. Selfridge, The nesting and roosting habits of the laddered parenthe- 
sis, Research Paper No. 127 (June 1971), The University of Calgary; see also this issue page 868. 

5. E. Netto, Lehrbuch der Combinatorik, Chelsea, New York, 1958, p. 192 ff. 

6. J. Touchard, Sur certaines équations fonctionnelles, Proc. Internat. Congr. Math., Toronto, 1924,
University of Toronto Press, Toronto, 1928, 465-472.


DETERMINATION OF THE RIEMANN FUNCTION 
E. J. SCOTT, University of Illinois


1. Introduction. The Riemann method is often used to solve a Cauchy problem 
relative to the hyperbolic equation 


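$$\frac{\partial^2 u}{\partial y\,\partial x} + \alpha(x, y)\frac{\partial u}{\partial x} + \beta(x, y)\frac{\partial u}{\partial y} + \gamma(x, y)\,u = \delta(x, y), \tag{1.1}$$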


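where $\alpha, \beta, \gamma$ are assumed twice continuously differentiable functions in some region
R and $\delta$ is continuous there. To do this one must first find the Riemann function
v(x, y) which satisfies the adjoint equation of (1.1), namely,

$$\frac{\partial^2 v}{\partial y\,\partial x} - \frac{\partial}{\partial x}(\alpha v) - \frac{\partial}{\partial y}(\beta v) + \gamma v = 0, \qquad x \ge 0,\ y \ge 0, \tag{1.2}$$

the boundary conditions

$$\frac{\partial v(x, 0)}{\partial x} - \beta(x, 0)\,v(x, 0) = 0, \tag{1.3}$$

$$\frac{\partial v(0, y)}{\partial y} - \alpha(0, y)\,v(0, y) = 0, \tag{1.4}$$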


and the condition 

(1.5) v(0,0) = 1. 

This auxiliary problem, which is known as a Goursat problem, has a unique solution 
under the assumptions made above [2]. 


2. Use of the Laplace transformation. We shall show how, under appropriate 
conditions to be given subsequently, it is possible to solve the Goursat problem for 
the Riemann function given by equations (1.2) to (1.5) by means of the Laplace 
transformation [1]. To that end, let us take the Laplace transform of equation (1.2) 
with respect to x where 


$$L\{v(x, y)\} = \int_0^\infty e^{-sx}\, v(x, y)\,dx = \bar v(s, y).$$


Then, since 


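$$L\left\{\frac{\partial^2 v}{\partial y\,\partial x}\right\} = s\,\frac{d\bar v(s, y)}{dy} - \frac{dv(0, y)}{dy},$$

$$L\left\{\frac{\partial}{\partial x}(\alpha v)\right\} = s\int_0^\infty e^{-sx}(\alpha v)\,dx - \alpha(0, y)\,v(0, y),$$

(assuming that Re(s) > 0 and $\alpha$, v are such that $\lim_{x \to \infty} (\alpha v)e^{-sx} = 0$),

$$L\left\{\frac{\partial}{\partial y}(\beta v)\right\} = \frac{\partial}{\partial y}\int_0^\infty e^{-sx}(\beta v)\,dx,$$

(assuming the validity of interchanging the operations of integration and differentiation), and

$$L\{\gamma v\} = \int_0^\infty e^{-sx}(\gamma v)\,dx,$$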




equation (1.2) is transformed into 


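$$s\,\frac{d\bar v(s, y)}{dy} - s\int_0^\infty e^{-sx}(\alpha v)\,dx - \frac{\partial}{\partial y}\int_0^\infty e^{-sx}(\beta v)\,dx + \int_0^\infty e^{-sx}(\gamma v)\,dx - \left[\frac{dv(0, y)}{dy} - \alpha(0, y)\,v(0, y)\right] = 0,$$

or, in view of condition (1.4),

$$s\,\frac{d\bar v(s, y)}{dy} - s\int_0^\infty e^{-sx}(\alpha v)\,dx - \frac{\partial}{\partial y}\int_0^\infty e^{-sx}(\beta v)\,dx + \int_0^\infty e^{-sx}(\gamma v)\,dx = 0. \tag{1.6}$$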


Because of the presence of the products $\alpha v$, $\beta v$, and $\gamma v$ appearing in the integrands
of the last three terms, progress in determining the function $\bar v(s, y)$ is somewhat
limited unless we make some further simplifying, but not unduly restrictive, assumptions
on the functions $\alpha, \beta, \gamma$. To obtain a tractable equation involving $\bar v(s, y)$, we
shall assume that $\alpha, \beta, \gamma$ are functions of y alone. If this be so, then equation (1.6)
becomes the first order linear differential equation

$$[s - \beta(y)]\,\frac{d\bar v(s, y)}{dy} + [\gamma(y) - \beta'(y) - s\alpha(y)]\,\bar v(s, y) = 0 \tag{1.7}$$


whose solution is readily shown to be 


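$$\bar v(s, y) = c\,\exp\left[\int_0^y \alpha(\tau)\,d\tau\right] \exp\left[-\int_0^y \frac{\gamma(\tau) - \beta'(\tau) - \alpha(\tau)\beta(\tau)}{s - \beta(\tau)}\,d\tau\right], \tag{1.8}$$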


where c is generally a function of s. To determine c we turn to condition (1.3) and note
that if $\beta$ is a function of y alone it becomes the differential equation

$$\frac{dv(x, 0)}{dx} - \beta(0)\,v(x, 0) = 0 \tag{1.9}$$

the Laplace transform of which is

$$s\,\bar v(s, 0) - v(0, 0) - \beta(0)\,\bar v(s, 0) = 0,$$

or, in view of (1.5),

$$\bar v(s, 0) = \frac{1}{s - \beta(0)}. \tag{1.10}$$

We conclude from equations (1.8) and (1.10) that

$$c = \frac{1}{s - \beta(0)}. \tag{1.11}$$


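Hence,

$$\bar v(s, y) = \exp\left[\int_0^y \alpha(\tau)\,d\tau\right] \frac{\exp\left[-\displaystyle\int_0^y \frac{\gamma(\tau) - \beta'(\tau) - \alpha(\tau)\beta(\tau)}{s - \beta(\tau)}\,d\tau\right]}{s - \beta(0)},$$

whence,

$$v(x, y) = \exp\left[\int_0^y \alpha(\tau)\,d\tau\right] L^{-1}\left\{\frac{\exp\left[-\displaystyle\int_0^y \frac{\gamma(\tau) - \beta'(\tau) - \alpha(\tau)\beta(\tau)}{s - \beta(\tau)}\,d\tau\right]}{s - \beta(0)}\right\}. \tag{1.12}$$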

From this formula we can obtain solutions to a number of specific problems. 
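For example: If $\gamma(y) = \alpha(y)\beta(y)$, then

$$v(x, y) = \exp\left[\int_0^y \alpha(\tau)\,d\tau\right] L^{-1}\left\{\frac{1}{s - \beta(y)}\right\} = e^{\beta(y)x} \exp\left[\int_0^y \alpha(\tau)\,d\tau\right].$$

If $\gamma(y) \ne \alpha(y)\beta(y)$ and $\alpha(y)$, $\beta(y)$, $\gamma(y)$ are the constants $\alpha, \beta, \gamma$, respectively, then

$$v(x, y) = e^{\alpha y} L^{-1}\left\{\frac{e^{-(\gamma - \alpha\beta)y/(s - \beta)}}{s - \beta}\right\} = e^{\alpha y + \beta x} J_0\bigl(2\sqrt{(\gamma - \alpha\beta)xy}\bigr).$$

In particular, if

(a) $\alpha = \beta = \gamma = 0$, then $v(x, y) = 1$,

(b) $\alpha = 0$, $\beta = 0$, $\gamma \ne 0$, then $v(x, y) = J_0(2\sqrt{\gamma xy})$,

(c) $\alpha = 0$, $\beta \ne 0$, $\gamma \ne 0$, then $v(x, y) = e^{\beta x} J_0(2\sqrt{\gamma xy})$,

(d) $\alpha \ne 0$, $\beta = 0$, $\gamma \ne 0$, then $v(x, y) = e^{\alpha y} J_0(2\sqrt{\gamma xy})$.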

As another example, let $\alpha(y) = -1$, $\beta(y) = y$ and $\gamma(y) = 2 - y$. Then, from
(1.12)

$$v(x, y) = e^{-y} L^{-1}\left\{\frac{\exp\left[-\int_0^y d\tau/(s - \tau)\right]}{s}\right\} = e^{-y} L^{-1}\{(s - y)/s^2\} = e^{-y}(1 - xy).$$


Remark. A similar analysis can be carried out if $\alpha(x, y)$, $\beta(x, y)$, $\gamma(x, y)$ be assumed
functions of x alone. The Laplace transform is then taken with respect to y.
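
The closed forms obtained from (1.12) can be checked numerically by substituting them into the adjoint equation (1.2). The sketch below is not from the paper: it estimates the left-hand side of (1.2) by central differences for the example with $\alpha = -1$, $\beta = y$, $\gamma = 2 - y$ and for one constant-coefficient case. The function names, the step size, the sample point and the particular constants are all assumptions chosen for illustration.

```python
import math


def residual(v, alpha, beta, gamma, x, y, h=1e-3):
    """Central-difference estimate of v_xy - (alpha v)_x - (beta v)_y + gamma v.

    alpha, beta, gamma are functions of y alone, as assumed in the text.
    """
    v_xy = (v(x + h, y + h) - v(x + h, y - h)
            - v(x - h, y + h) + v(x - h, y - h)) / (4 * h * h)
    av_x = alpha(y) * (v(x + h, y) - v(x - h, y)) / (2 * h)
    bv_y = (beta(y + h) * v(x, y + h) - beta(y - h) * v(x, y - h)) / (2 * h)
    return v_xy - av_x - bv_y + gamma(y) * v(x, y)


def bessel_j0_of_2sqrt(t, terms=40):
    """J_0(2*sqrt(t)) from its power series sum_m (-t)^m / (m!)^2."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -t / ((m + 1) ** 2)
    return total


def riemann_example(x, y):
    """v = e^{-y}(1 - xy), the example with alpha = -1, beta = y, gamma = 2 - y."""
    return math.exp(-y) * (1.0 - x * y)


A, B, G = 0.5, -0.3, 1.2  # illustrative constant coefficients


def riemann_constant(x, y):
    """v = e^{A y + B x} J_0(2 sqrt((G - A B) x y)), the constant-coefficient case."""
    return math.exp(A * y + B * x) * bessel_j0_of_2sqrt((G - A * B) * x * y)


r1 = residual(riemann_example, lambda y: -1.0, lambda y: y,
              lambda y: 2.0 - y, 0.7, 1.3)
r2 = residual(riemann_constant, lambda y: A, lambda y: B, lambda y: G, 0.7, 1.3)
print(r1, r2)  # both should be small, limited by the finite-difference error
```

The Bessel function is summed from its series so that the block needs only the standard library; the series form also covers a negative argument under the square root.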


References 


1. G. Doetsch, Theorie und Anwendung der Laplace-Transformation, Springer-Verlag, Berlin, 
1937. 

2. A. N. Tychonov and A. A. Samarski, Partial Differential Equations of Mathematical Physics, 
Vol. 1, Holden-Day, San Francisco, 1964, page 114. 


INEQUALITIES FOR THE AREA OF TWO TRIANGLES 
L. CARLITZ, Duke University


Let a, b, c denote the sides of the triangle ABC and let a’, b’,c’ denote the sides 
of the triangle A’B’C’. Let F, F’ denote the respective areas. Pedoe [2] has pro- 
ved that 


$$a^2(-a'^2 + b'^2 + c'^2) + b^2(a'^2 - b'^2 + c'^2) + c^2(a'^2 + b'^2 - c'^2) \ge 16FF', \tag{1}$$


with equality if and only if the triangles ABC, A’B’C’ are similar. The writer [1] 
has recently given a simple algebraic proof of (1). 
Since 


$$F^2 + F'^2 \ge 2FF',$$

it is natural to ask whether

$$H \ge 8(F^2 + F'^2), \tag{2}$$

where H denotes the left hand side of (1). We shall prove the following result.

THEOREM. If the differences

$$a^2 - a'^2, \quad b^2 - b'^2, \quad c^2 - c'^2 \tag{3}$$

are all positive or all negative and in addition the numbers

$$|a^2 - a'^2|^{1/2}, \quad |b^2 - b'^2|^{1/2}, \quad |c^2 - c'^2|^{1/2} \tag{4}$$

form the sides of a triangle $\Delta$ (possibly degenerate) then

$$8(F^2 + F'^2) - H = 8F^2(\Delta), \tag{5}$$

where $F(\Delta)$ denotes the area of $\Delta$. Otherwise

$$H \ge 8(F^2 + F'^2). \tag{6}$$

Proof. Since

$$16F^2 = 2b^2c^2 + 2c^2a^2 + 2a^2b^2 - a^4 - b^4 - c^4,$$

$$16F'^2 = 2b'^2c'^2 + 2c'^2a'^2 + 2a'^2b'^2 - a'^4 - b'^4 - c'^4,$$

it is easily verified that

$$16(F^2 + F'^2) - 2H = 2\sum (a^2 - a'^2)(b^2 - b'^2) - \sum (a^2 - a'^2)^2. \tag{7}$$

Now assume that the differences (3) are all positive or all negative and put

$$x = |a^2 - a'^2|^{1/2}, \quad y = |b^2 - b'^2|^{1/2}, \quad z = |c^2 - c'^2|^{1/2}$$




so that (7) becomes 

$$16(F^2 + F'^2) - 2H = 2\sum x^2y^2 - \sum x^4. \tag{8}$$

We recall that three positive numbers x, y, z form the sides of a triangle if and only if

$$2\sum x^2y^2 - \sum x^4 \ge 0. \tag{9}$$

Moreover, the left hand side of (9) is equal to sixteen times the square of the area
of this triangle. This evidently proves (5).
If however $2\sum x^2y^2 - \sum x^4 < 0$, it follows from (8) that

$$H > 8(F^2 + F'^2). \tag{10}$$

If z = 0, say, (8) reduces to

$$16(F^2 + F'^2) - 2H = 2x^2y^2 - x^4 - y^4 < 0,$$

(unless x = y = 0, so that (10) holds in this case also).
Finally assume that not all the differences are of the same sign. We may suppose
that

$$a^2 - a'^2 = -u, \quad b^2 - b'^2 = v, \quad c^2 - c'^2 = w,$$

where $u \ge 0$, $v \ge 0$, $w \ge 0$. Then (7) becomes

$$16(F^2 + F'^2) - 2H = -2uv - 2uw + 2vw - u^2 - v^2 - w^2 = -(u + v - w)^2 - 4uw \le 0,$$

with equality only when u = 0, v = w.
This completes the proof of the theorem. 
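
The identity (7), on which the whole proof rests, is easy to test in exact integer arithmetic. The sketch below is an illustration only; the function names and the two pairs of triangles are chosen here and do not come from the note. The first pair has all differences (3) positive, the second has differences of mixed sign.

```python
def sixteen_area_sq(a, b, c):
    """16 F^2 for a triangle with sides a, b, c (Heron's formula, expanded)."""
    return (2 * b * b * c * c + 2 * c * c * a * a + 2 * a * a * b * b
            - a ** 4 - b ** 4 - c ** 4)


def pedoe_lhs(t, tp):
    """H, the left-hand side of (1), for side triples t = (a, b, c), tp = (a', b', c')."""
    a, b, c = t
    ap, bp, cp = tp
    return (a * a * (-ap * ap + bp * bp + cp * cp)
            + b * b * (ap * ap - bp * bp + cp * cp)
            + c * c * (ap * ap + bp * bp - cp * cp))


def gap_direct(t, tp):
    """16(F^2 + F'^2) - 2H computed from the definitions."""
    return sixteen_area_sq(*t) + sixteen_area_sq(*tp) - 2 * pedoe_lhs(t, tp)


def gap_identity(t, tp):
    """Right-hand side of (7), in terms of the differences a^2 - a'^2, etc."""
    d = [s * s - sp * sp for s, sp in zip(t, tp)]
    return 2 * (d[0] * d[1] + d[1] * d[2] + d[2] * d[0]) - sum(x * x for x in d)


same_sign = ((5, 5, 6), (3, 4, 5))   # differences 16, 9, 11
mixed_sign = ((3, 4, 5), (4, 3, 5))  # differences -7, 7, 0
for pair in (same_sign, mixed_sign):
    assert gap_direct(*pair) == gap_identity(*pair)
print(gap_direct(*same_sign), gap_direct(*mixed_sign))
```

A positive value means H < 8(F² + F'²), as in case (5) of the theorem; a negative value means H > 8(F² + F'²).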


Supported in part by NSF grant GP-17031. 


References 


1. L. Carlitz, An inequality involving the area of two triangles, this MONTHLY, 78 (1971) 772. 
2. D. Pedoe, Thinking geometrically, this MONTHLY, 77 (1970) 711-721. 


A BANACH SPACE CHARACTERIZATION OF THE SPACE OF 
AFFINE CONTINUOUS FUNCTIONS ON A COMPACT CONVEX SET 


P. D. TAYLOR, Queen’s University at Kingston, Ontario, Canada


A subset K of a linear space E is called convex if K contains, with any two of 
its points, the line segment joining them. A real-valued function f on K is called affine 


if graph(f) is convex. For example, any linear functional on E is, when restricted
to K, affine. Algebraically we have

$$K \text{ convex} \iff \bigl(x, y \in K \implies tx + (1 - t)y \in K \quad \forall t,\ 0 \le t \le 1\bigr),$$

$$f \text{ affine} \iff f(tx + (1 - t)y) = tf(x) + (1 - t)f(y) \quad \forall t,\ 0 \le t \le 1.$$


Now let’s put topology in the picture. Let E be a linear topological space, K a 
compact, convex subset and let A(K) denote the set of continuous affine functions 
on K. With the sup norm 


$$\|f\| = \sup\{|f(x)| : x \in K\},$$


A(K) is a real Banach space, that is, a real, complete, normed, linear space. 

Now ask the following question: Suppose we are given a real Banach space. 
How can we tell whether or not it is the space of affine continuous functions on 
some compact convex set? 

Let’s be precise. The correct notion of equivalence for Banach spaces is isometry. 
Two Banach spaces are isometric if there is a bijective linear map $\Phi$ between them
which preserves the norms: $\|\Phi(a)\| = \|a\|$. Such a map is called an isometry.
Now we can state our problem precisely. 

Given a real Banach space, find a property of the unit ball which will allow
us to tell whether or not the space is isometric to the space of affine continuous
functions on some compact convex set.

Notice that we have asked for a characterization in terms of the unit ball.
The unit ball is the set of elements of norm less than or equal to one and it deter- 
mines and is determined by the norm. It is often more illuminating to think in terms 
of the unit ball rather than the norm. For example, an isometry is just a bijective 
linear map which sends the unit ball of one space onto the unit ball of the other. 

Let’s look at an example. Suppose K is an interval [a, b]. A member of A(K)
is determined by its values at the end points of K. In fact the map $f \mapsto (f(a), f(b))$
is a linear bijection between A(K) and $R^2$. If we give $R^2$ the sup norm,
$\|(x, y)\| = \max\{|x|, |y|\}$, this map will be an isometry.

Actually whenever A(K) is two-dimensional, K is an interval. In general, if K
contains n affinely independent points, then clearly dim A(K) $\ge$ n. So if dim A(K) = 2
then K must be a compact convex subset of the real line, hence a closed interval.

Now let’s solve our problem for two-dimensional Banach spaces. We have 
shown that any two-dimensional A(K) space is isometric to $R^2$ with the sup norm.
What other norms can $R^2$ have, which are isometric to the sup norm? The unit
ball of the sup norm is a square, so in terms of unit balls we are asking: what is the
image of a square under a bijective linear transformation of $R^2$? A parallelogram,
of course! Let us define a parallelogram in a Banach space to be the convex hull 
of four points u,v,w and x, not all in a straight line, for which u+v=w+x. 




Then a two-dimensional Banach space is isometric to some A(K) if and only if 
the unit ball is a parallelogram. Indeed since any bijective linear image of a paralle- 
logram is a parallelogram, we have shown that the condition is necessary. On the 
other hand, any Banach space whose unit ball is a parallelogram, is by the obvious 
map, isometric to $R^2$ with the sup norm, hence to A([0, 1]).

For another example suppose K is a triangle with vertices a, b and c. Then 
A(K) is the set of functions with planar graphs, and the map $f \mapsto (f(a), f(b), f(c))$
is a linear bijection between A(K) and $R^3$. Since |f| must attain its maximum at
a vertex of K, this map is an isometry if we give $R^3$ the sup norm. This time the unit
ball of $R^3$ is a cube.

But now in three dimensions it is no longer true that any two A(K) spaces are 
isometric. For example, take K to be the square with vertices a, b, c and d (in cyclic 
order). Then A(K) is again the set of functions with planar graphs and the map 
$f \mapsto (f(a), f(b), f(c))$ is a linear bijection between A(K) and $R^3$. But the sup norm
on $R^3$ will not make this an isometry, since |f| need not attain its maximum at
a, b or c. Since f((a + c)/2) = f((b + d)/2) we have $f(d) = f(a) + f(c) - f(b)$
(since f is affine), and hence the above map is an isometry if we give $R^3$ the norm

$$\|(x, y, z)\| = \max\{|x|, |y|, |z|, |x + z - y|\}.$$


With this norm, the unit ball of $R^3$ is no longer the whole cube, but only that part
of the cube between the two planes $x + z - y = \pm 1$. We can visualize this unit
ball by observing that the plane $x + z - y = 1$ cuts off the vertex (1, -1, 1) from
the cube by passing through the three adjacent vertices, (-1, -1, 1), (1, 1, 1) and
(1, -1, -1). Similarly the plane $x + z - y = -1$ cuts off the opposite vertex
(-1, 1, -1) by passing through its three adjacent vertices. This is certainly not a
parallelopiped, so that this norm is not isometric to the sup norm. 
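
The norm in the square example can be checked by brute force. The sketch below is an illustration, not part of the paper: it assumes the square is the unit square with a = (0, 0), b = (1, 0), c = (1, 1), d = (0, 1), builds the affine function with prescribed values at a, b, c, and compares its sup norm over a grid on the square with max{|x|, |y|, |z|, |x + z − y|}. All names are hypothetical.

```python
def square_norm(x, y, z):
    """The norm on R^3 that makes f -> (f(a), f(b), f(c)) an isometry."""
    return max(abs(x), abs(y), abs(z), abs(x + z - y))


def affine_on_square(fa, fb, fc):
    """Affine f on the unit square with f(0,0) = fa, f(1,0) = fb, f(1,1) = fc."""
    def f(s, t):
        return fa + (fb - fa) * s + (fc - fb) * t
    return f


def sup_on_grid(f, steps=20):
    """max |f| over a grid on [0, 1]^2 that includes the four vertices."""
    return max(abs(f(i / steps, j / steps))
               for i in range(steps + 1) for j in range(steps + 1))


for values in [(1, -1, 1), (1, 1, 1), (2, 0, -1)]:
    f = affine_on_square(*values)
    # f(d) = f(a) + f(c) - f(b), so the fourth vertex value is forced.
    assert abs(f(0.0, 1.0) - (values[0] + values[2] - values[1])) < 1e-12
    assert abs(sup_on_grid(f) - square_norm(*values)) < 1e-12
print(square_norm(1, -1, 1))
```

The triple (1, −1, 1) is the cube vertex cut off by the plane x + z − y = 1: its sup norm in the cube is 1, but the corresponding affine function takes a larger absolute value at the fourth vertex d.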

Let us recall our problem: to characterize the unit balls of A(K) spaces. The 
unit ball of the last example, although not a parallelopiped, is still “parallelogramic”
in nature in the following sense: any plane passing through the vertices (1, 1,1) 
and (—1, —1, —1) cuts the unit ball in a parallelogram. This property can be trans- 
lated via the isometry to the A(K) space. The vertex (1,1, 1) corresponds to the 
constant function 1 and thus if K is a square, any two-dimensional subspace of 
A(K) containing the constant function 1 intersects the unit ball in a parallelogram. 
We can ask if this is true for any A(K) space. The answer is yes, and furthermore, 
this property characterizes A(K). 


THEOREM. Let A be a Banach space with unit ball B. Then A is isometric 
to A(K) for some compact convex K if and only if there is a point e in B with the 
property that every two-dimensional subspace of A containing e intersects B in a 
parallelogram with e as a vertex. 




Proof. If two spaces are isometric and one of them has the condition, so does 
the other. So to prove the necessity it is enough to show that A(K) has the condi- 
tion. Suppose K is compact, convex. Let e be the constant function 1 on K. If L 
is a two-dimensional subspace of A(K) containing e, choose $f \in L$ such that $\|f\| = 1$,
$\sup_K f = 1$ and $f \ne e$. Let $m = \inf_K f$. Then m < 1 since $f \ne e$. Let

$$g = \frac{2}{(1 - m)}\, f - \frac{(m + 1)}{(1 - m)}\, e.$$

Then $g \in L$ and an easy calculation shows $\sup_K g = 1$ and $\inf_K g = -1$. Hence
the parallelogram with vertices $\pm e$ and $\pm g$ has the property that every point
on its boundary has norm 1. This parallelogram must then be the unit ball of L. 

To prove the converse we need to know a bit about Banach spaces. If A is a Ban- 
ach space we denote by A* the linear space of norm-continuous linear functionals 
on A. With the norm 


$$\|\gamma\| = \sup\{|\gamma(a)| : a \in A,\ \|a\| \le 1\} \qquad (\gamma \in A^*),$$


A* is a Banach space. By the weak * topology on A*, we mean the weakest topology 
in which the elements of A, considered as functionals on A*, are continuous. With 
this topology A* is a linear topological space. A discussion of this topology can be
found in Royden [3]. One of the basic results is that the unit ball of A* is compact 
in the weak * topology [3, Chap. 10, Theorem 15]. 

Now suppose A is a Banach space, and e is a point in the unit ball B with the 
property of the theorem. If dim A = 1 then A is isometric to A(K) for any K
containing only one point, and we are finished. So suppose dim A $\ge$ 2. Then A
contains a two-dimensional subspace L such that e is a vertex of the parallelogram
$L \cap B$. It follows that $\|e\| = 1$. Let

$$K = \{\gamma \in A^* : \|\gamma\| = 1 \text{ and } \gamma(e) = 1\}.$$


First notice that K is a weak * closed subset of B*, the unit ball of A*. Indeed,
any $\gamma$ in the closure of K must have $\gamma(e) = 1$ and $\|\gamma\| \le 1$. Since $\|e\| = 1$ we have
$\|\gamma\| = 1$, and hence $\gamma \in K$. Since B* is weak * compact, so is K. Notice next that
K is convex. Indeed any $\gamma$ on the line segment between two points of K must have
$\gamma(e) = 1$ and $\|\gamma\| \le 1$. Again since $\|e\| = 1$ we have $\|\gamma\| = 1$ and $\gamma \in K$. So
K is a compact convex subset of a linear topological space.

Since members of A, if considered as functionals on A*, are linear and weak
* continuous, we have a natural linear map $\Phi: A \to A(K)$, by restricting members
of A to K. For $a \in A$ let

$$\|a\|_K = \sup\{|\gamma(a)| : \gamma \in K\}.$$

We shall have shown that $\Phi$ preserves norm if we show that $\|a\| = \|a\|_K$. It will


be enough to show this whenever $\|a\| = 1$ and $a \ne \pm e$. In this case let L be the
two-dimensional subspace of A spanned by a and e. Since the unit ball of L is a
parallelogram with e as a vertex, and a in the boundary, we can find $\lambda \in L^*$ such
that $\lambda(e) = 1$, $|\lambda(a)| = 1$ and $\|\lambda\| = 1$. Use the Hahn-Banach Theorem [3, Chap.
10, Thm. 4], to choose $\gamma \in A^*$ with norm 1 and extending $\lambda$. Then $\gamma \in K$ and
$|\gamma(a)| = 1$ from which $\|a\|_K \ge 1$. Since $K \subset B^*$, $\|a\|_K \le 1$ and $\|a\|_K = 1 = \|a\|$.
Thus $\Phi$ preserves norm, and is also injective.

It remains to show that $\Phi$ is onto. It will be enough to show that $\Phi(A)$ is norm-
dense in A(K). Indeed A is complete and $\Phi$ preserves norm, so that $\Phi(A)$ is complete
and hence norm-closed in A(K).

To show that ®(A) is norm-dense in A(K), we need two results from linear 
topological spaces. First, if M is such a space and we give its dual space M* the 
weak * topology, what is the space of continuous linear functionals on M*? Clearly 
any member of M, regarded as a function on M*, is weak * continuous (by definition
of the weak * topology!). The result is that there are no others: the space of
weak * continuous linear functionals on M* is precisely M. The proof is short,
but tricky, and can be pieced together from Kelley and Namioka [1, 17.6]. Secondly
we need to know that if K is a compact convex subset of a linear topological space 
E then the linear span of the constants and the members of E*, restricted to K, is 
norm dense in A(K). This is an application of the separating-hyperplane theorem, 
and can be found in Phelps [2, Prop. 4.5]. 

Now we have what we need. By the first result, A is the dual space of A* with
the weak * topology. Since K is a compact convex subset of A* with this topology,
the second result tells us that $\Phi(A)$, together with the constants, spans a norm-dense
subspace of A(K). Since $\Phi(A)$ already contains the constants, we are finished.


References 


1. J. L. Kelley and I. Namioka, Linear Topological Spaces, Van Nostrand, Princeton, N. J.,
1963.
2. R. R. Phelps, Lectures on Choquet's Theorem, Van Nostrand, Princeton, N. J., 1966.
3. H. L. Royden, Real Analysis, Macmillan, New York, 1966.


ON LATTICE POINTS INSIDE CONVEX BODIES 


A. M. ODLYZKO, California Institute of Technology


If K is a convex body in n-dimensional Euclidean space, let V(K), A(K), and
N(K) denote, respectively, the n-dimensional volume of K, the $(n - 1)$-dimensional
surface area of K, and the number of lattice points (points with integer coordinates)
in the interior of K. For each positive real number $\alpha$, let


$I(\alpha, n) = \min\{\, N(K) : V(K)/A(K) \ge \alpha \,\}.$


Hadwiger [2] has recently proved that for all $n \ge 2$ we have


(1)   $I(\tfrac{1}{2}, n) \ge 1,$


while
$I(\alpha, n) = 0$ for all $\alpha < \tfrac{1}{2}$.


Essentially the same result had been proved previously by Bender [1] for n = 2, 
and Wills [5], [6] has shown that (1) holds for n = 2, 3, and 4. 
It follows from (1) and the superadditivity of $I(\alpha, n)$ as a function of $\alpha$ that


(2)   $I(\alpha, n) \ge [2\alpha]$ for $n \ge 2$,


where $[y]$ is the greatest integer $\le y$ (see [3] and [7]). This lower bound was raised
in the two-dimensional case by Hammer [4], who used Bender's result to show
that if r is a non-negative integer, then


$I(2^r, 2) \ge 2^{r+2} - 1.$


Combined with the superadditivity of $I(\alpha, n)$ that result gives an improved estimate
of $I(\alpha, 2)$ for general $\alpha$. This note proves the much stronger and more general result

(3)   $I(\alpha, n) \ge [2\alpha]^n$ for $n \ge 2$.

We also show that

(4)   $I(\alpha, n) \le \dfrac{\pi^{n/2}}{\Gamma((n/2) + 1)}\,(n\alpha + \sqrt{n})^n$ for $n \ge 1$,

where $\Gamma$ is the gamma function.
To prove (3) consider any convex body K such that $V(K)/A(K) \ge \alpha$, and let
$r = [2\alpha]$. We can assume that $r \ge 1$, since otherwise (3) is trivial. Define


$K' = \frac{1}{r} K = \{\, \frac{1}{r} x : x \in K \,\}.$


Then
$V(K') = r^{-n}\, V(K)$ and $A(K') = r^{-(n-1)}\, A(K)$.


Now let $a = (a_1, \ldots, a_n)$ be any n-tuple of integers such that $0 \le a_i \le r - 1$ for
$i = 1, \ldots, n$ and consider

$K' - \frac{1}{r} a = \{\, z - \frac{1}{r} a : z \in K' \,\}.$


Since $K' - (1/r)a$ is a translate of K', it has the same volume and surface area as
K'. In particular


$\dfrac{V(K' - (1/r)a)}{A(K' - (1/r)a)} = \dfrac{V(K')}{A(K')} = \dfrac{1}{r} \cdot \dfrac{V(K)}{A(K)} \ge \dfrac{\alpha}{[2\alpha]} \ge \dfrac{1}{2}.$


Hadwiger's theorem (inequality (1)) now asserts that $K' - (1/r)a$ contains at least
one lattice point. This means that K' contains a point of the form $b + (1/r)a$, where
$b = (b_1, \ldots, b_n)$ is a lattice point, which implies, by the definition of K', that K
contains the point $rb + a$. Thus for each one of the $r^n$ specified choices of $a = (a_1, \ldots, a_n)$
there is a lattice point $c = (c_1, \ldots, c_n)$ in K such that $c_i \equiv a_i \pmod{r}$ for $i = 1, \ldots, n$.
Since the $a_i$ satisfy $0 \le a_i \le r - 1$, all these $r^n$ lattice points c in K are distinct, and
therefore $N(K) \ge r^n$. This proves (3).

To prove (4) we utilize the well-known method of estimating the number of
lattice points in a sphere. Let S(x) be the n-dimensional sphere of radius x,
centered at the origin. Then $V(S(\alpha n)) = \omega_n (\alpha n)^n$ and $A(S(\alpha n)) = n \omega_n (\alpha n)^{n-1}$,
where


$\omega_n = \dfrac{\pi^{n/2}}{\Gamma((n/2) + 1)}$


is the volume of an n-dimensional unit sphere, and hence $V(S(\alpha n))/A(S(\alpha n)) = \alpha$.
Now with each lattice point $(a_1, \ldots, a_n)$ belonging to the interior of $S(\alpha n)$ we associate
the unit cube $\{\, (y_1, \ldots, y_n) : a_i \le y_i \le a_i + 1 \,\}$. The volume of the union of these
cubes is $N(S(\alpha n))$ and they are all contained in a sphere of radius $\alpha n + \sqrt{n}$ with
center at the origin (note that $\sqrt{n}$ is the length of the main diagonal of the unit
cube). Therefore


$N(S(\alpha n)) \le \omega_n (\alpha n + \sqrt{n})^n,$


and this implies the upper estimate (4).
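The two estimates can be compared numerically on the sphere used in the proof. The following is only a rough sketch: the function names are illustrative and not from the note, and the brute-force count is practical only for small n and small radius. Since the sphere of radius $\alpha n$ has volume/area ratio exactly $\alpha$, its interior lattice point count should lie between the right-hand sides of (3) and (4).

```python
import math
from itertools import product


def unit_ball_volume(n):
    """omega_n = pi^(n/2) / Gamma(n/2 + 1), the volume of the unit ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def lower_bound(alpha, n):
    """Right-hand side of (3): [2*alpha]^n."""
    return math.floor(2 * alpha) ** n


def upper_bound(alpha, n):
    """Right-hand side of (4): omega_n * (n*alpha + sqrt(n))^n."""
    return unit_ball_volume(n) * (n * alpha + math.sqrt(n)) ** n


def interior_lattice_points_of_ball(radius, n):
    """Count integer points x with |x| < radius by brute force."""
    bound = math.ceil(radius)
    radius_squared = radius * radius
    return sum(
        1
        for point in product(range(-bound, bound + 1), repeat=n)
        if sum(c * c for c in point) < radius_squared
    )


def compare(alpha, n):
    """Return (lower bound, lattice count of S(alpha*n), upper bound)."""
    count = interior_lattice_points_of_ball(alpha * n, n)
    return lower_bound(alpha, n), count, upper_bound(alpha, n)
```

For instance, `compare(1, 2)` examines the disc of radius 2, whose interior lattice points can also be counted by hand.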

Although the lower estimate (3) is probably not the best possible, the bound (4)
shows that as a function of $\alpha$ it is of the right order of magnitude. It seems likely,
however, that (4) gives a better approximation to $I(\alpha, n)$ than (3); in fact, we
conjecture that for each fixed $n \ge 1$ we have


$I(\alpha, n) = \dfrac{\pi^{n/2}}{\Gamma((n/2) + 1)}\, n^n \alpha^n (1 + o(1))$ as $\alpha \to \infty$.


Note added in proof. For further results, including a proof of the above conjecture, see J. Bokowski
and A. M. Odlyzko, Lattice points and the volume/area ratio of convex bodies, to appear in
Geometriae Dedicata.


Acknowledgement. I would like to thank Professor T. M. Apostol for his help in the preparation 
of this paper. 


References 


1. E. A. Bender, Area-perimeter relations for two-dimensional lattices, this MONTHLY, 69 (1962)
742-744.

2. H. Hadwiger, Volumen und Oberfläche eines Eikörpers, der keine Gitterpunkte überdeckt,
Math. Zeit., 116 (1970) 191-196.

3. J. Hammer, On a general area-perimeter relation for two-dimensional lattices, this MONTHLY,
71 (1964) 534-535.

4. -----, Some relatives of Minkowski's theorem for two-dimensional lattices, this MONTHLY,
73 (1966) 744-746.

5. J. M. Wills, Ein Satz über konvexe Mengen und Gitterpunkte, Monatsh. Math., 72 (1968)
451-463.

6. -----, Ein Satz über konvexe Körper und Gitterpunkte, Abh. Math. Sem. Univ. Hamburg,
(1970) 8-13.

7. -----, On lattice points and the volume/area ratio of convex bodies, this MONTHLY, 78
(1971) 47-49.


INCREASING CONTINUOUS SINGULAR FUNCTIONS 
GERALD FREILICH, Queens College, (CUNY) 


The Cantor function (sometimes referred to as the Lebesgue function) is the 
standard example of a monotone continuous singular function; though monotone, 
it is not strictly increasing. An ingenious example of a strictly increasing function 
with these properties is given in Riesz and Sz.-Nagy [2, pp. 48-49] and in Hewitt 
and Stromberg [1, pp. 278-282]. In this note, another method of constructing 
such functions is given. 

We first review the definition of the Cantor function. Let S denote the Cantor 
ternary set defined by 


$S = \left\{ \sum_{n=1}^{\infty} 3^{-n} a_n : a_n = 0 \text{ or } 2 \text{ for all } n \right\}.$

If $x = \sum_{n=1}^{\infty} 3^{-n} a_n$ and if for all n, $a_n = 0$ or 2, then the Cantor function assigns
to x the value $C(x) = \sum_{n=1}^{\infty} 2^{-n-1} a_n$. For $0 \le y \le 1$, the Cantor function
assigns to y the value


$C(y) = \sup\{\, C(x) : x \in S \text{ and } x \le y \,\}.$


We shall let C denote the usual Cantor function extended to the domain $(-\infty, \infty)$
by defining C(x) = 0 if x < 0 and C(x) = 1 if $x \ge 1$. It is well known that C is
nondecreasing, continuous, and that C' = 0 almost everywhere.

We now give a simple construction of an increasing function with the above 
properties along with a simple proof of these properties. 


CONSTRUCTION: Let $A = \{a_1, a_2, \ldots\}$ be any countable set that is dense in the
reals. Define


$f(x) = \sum_{n=1}^{\infty} 2^{-n} C(2^n (x - a_n)).$


ASSERTION. The function f is continuous, strictly increasing, and singular.


Proof. The uniform convergence of the sum defining f implies that f is continuous.
If $x_2 < x_1$, choose $a_n$ so that $x_2 < a_n < x_1$. Then


$C(2^n (x_1 - a_n)) > 0 = C(2^n (x_2 - a_n))$ and $C(2^m (x_1 - a_m)) \ge C(2^m (x_2 - a_m))$


for $m \ne n$, so that $f(x_1) > f(x_2)$. By Fubini's theorem on differentiation of series
with monotonic terms [2, pp. 11-12],


$f'(x) = \sum_{n=1}^{\infty} C'(2^n (x - a_n)) = 0$ for almost all x.
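The construction can be explored numerically. The sketch below is illustrative only: `cantor` evaluates the extended Cantor function from ternary digits in floating point, and `f_partial` is a finite partial sum of the series over a short, hypothetical list of dyadic points standing in for the dense set A. A finite list is of course not dense, so the partial sum is only nondecreasing, not strictly increasing.

```python
def cantor(x, depth=48):
    """Approximate the extended Cantor function C at a real x."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = min(int(x), 2)  # next ternary digit of x
        x -= digit
        if digit == 1:
            # x lies in a removed middle third, where C is constant
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value


def f_partial(x, points):
    """Partial sum of f(x) = sum_n 2^-n C(2^n (x - a_n)), n = 1..len(points)."""
    return sum(
        2.0 ** -n * cantor(2.0 ** n * (x - a))
        for n, a in enumerate(points, start=1)
    )


# Hypothetical stand-in for the first few terms of a dense sequence a_1, a_2, ...
DYADIC_POINTS = [0.5, 0.25, 0.75, 0.125, 0.375, 0.625, 0.875]
```

Taking more points from a genuinely dense enumeration tightens the approximation, because the n-th term contributes at most $2^{-n}$.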


References 


1. E. Hewitt and K. Stromberg, Real and Abstract Analysis, Springer Verlag, New York, 1965, 
MR 32 #5826.
2. F. Riesz and B. Sz.-Nagy, Functional Analysis, Ungar, New York, 1955, MR 17, 175. 


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and 
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


QUESTIONS ON A SEQUENCE OF ULAM 
BERNARDO RECAMAN, Colegio San Carlos, Bogota, Colombia 


Ulam [1] introduced sequences of positive integers constructed in the following
way: "Let $u_1$ and $u_2$ be given integers; we construct an increasing sequence of
integers by adjoining those which can be represented in just one way as the sum of
two distinct preceding members of the sequence." I am concerned with the case
$u_1 = 1$, $u_2 = 2$ and shall call members of this sequence U-numbers. That there are
infinitely many U-numbers may be seen by supposing that they form a finite set;
then the sum of the two largest would be a U-number not in the set, contradicting
our supposition. The first few U-numbers are


1,2,3,4, 6,8, 11, 13, 16, 18, 26, 28, 36, 38, 47, 48, 53, 57, 62, 69, 72, 77,82, 87, 97, 99, ... 
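The list above can be regenerated directly from Ulam's rule. This is a minimal sketch (the function names are illustrative): a candidate is adjoined exactly when it has one representation as a sum of two distinct earlier members.

```python
def count_representations(n, members):
    """Number of ways to write n = a + b with a < b, both in `members`."""
    member_set = set(members)
    return sum(1 for a in members if a < n - a and (n - a) in member_set)


def ulam_numbers(limit, u1=1, u2=2):
    """Return all U-numbers not exceeding `limit`, starting from u1 < u2."""
    sequence = [u1, u2]
    for candidate in range(u2 + 1, limit + 1):
        if count_representations(candidate, sequence) == 1:
            sequence.append(candidate)
    return sequence
```

The helper `count_representations` can also be used on non-members, for example to look for numbers that are not the sum of two distinct U-numbers at all.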


From the many questions that can be asked about U-numbers, I choose the following: 
1. Apart from 1 + 2 = 3, can the sum of two consecutive U-numbers be a U-number?
It cannot be the next U-number, as R. B. Eggleton (written communication)
observes, since if $u_n + u_{n-1} = u_{n+1}$ is a unique sum of U-numbers,
so is $u_n + u_{n-2}$, which lies between $u_n$ and $u_n + u_{n-1}$.

2. Are there infinitely many numbers (such as 23, 25, 33, 35, 43, 45, 67, 92, 94, 96, ...)
which are not the sum of two U-numbers?

3. (The problem with which Ulam introduced the sequences.) Do the U-numbers
have positive density? In other words, if U(x) is the number of U-numbers less
than x, is $\liminf U(x)/x > 0$? The argument given above shows that $U(x) > \log_2 x$
since each U-number is less than twice the preceding; in fact the observation
made under question 1 implies that $u_{n+1} \le u_n + u_{n-2}$ and improves the estimate
to $U(x) > \log_\alpha x$ where $\alpha$ is the real root of $x^3 - x^2 - 1 = 0$.

4. Are there infinitely many instances in which two consecutive integers are both
U-numbers, such as 47 and 48? Clearly, three consecutive integers greater
than $u_2 = 2$ cannot all be U-numbers.

5. Are there arbitrarily large gaps in the sequence of U-numbers? 


Reference 


1. S. M. Ulam, Problems in Modern Mathematics, Interscience, New York, 1964, p. ix. 


EQUITABLE COLORING 
WALTER Meyer, Adelphi University 


Recently the notion of vertex colorings for graphs has proved useful in an opera- 
tions research context [2]. In the application in question, vertices represented garbage 
collection routes and two such vertices were joined when the corresponding routes 
should not be run on the same day. The problem of assigning one of the six days 
of the work week to each route becomes the problem of six-coloring the graph. 
On practical grounds it might also be desirable to have an approximately equal 
number of routes run on each of the six days. This suggests: 


DEFINITION. Suppose the vertices of a graph G are colored with p colors so that 
no edge joins vertices of the same color and the cardinalities of the color sets differ 
by at most one. Then G is said to be equitably p-colored. If p is the least integer n 
for which G can be equitably n-colored, then p is the equitable chromatic number 
of G, denoted by $\chi_e(G)$.


For example, for the 'star' graph $K_{1,n}$, in which one vertex is joined to each of
n others, $\chi_e(K_{1,n}) = 1 + \{n/2\}$, where {x} denotes the least integer not less than x.
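The star example can be checked by exhaustive search on small cases. The sketch below is an assumption-laden illustration, not the author's method: it tries every assignment of p colors, treats an unused color as a class of size 0, and is therefore feasible only for graphs with a handful of vertices. Here `ceil` plays the role of {x}.

```python
from itertools import product
from math import ceil


def is_equitable_coloring(edges, coloring, p):
    """True if no edge is monochromatic and class sizes differ by at most one."""
    if any(coloring[u] == coloring[v] for u, v in edges):
        return False
    sizes = [0] * p
    for color in coloring:
        sizes[color] += 1
    return max(sizes) - min(sizes) <= 1


def equitable_chromatic_number(num_vertices, edges):
    """Least p admitting an equitable p-coloring, found by brute force."""
    for p in range(1, num_vertices + 1):
        for coloring in product(range(p), repeat=num_vertices):
            if is_equitable_coloring(edges, coloring, p):
                return p
    raise ValueError("no equitable coloring found")


def star(n):
    """The star K_{1,n}: vertex 0 joined to vertices 1..n."""
    return n + 1, [(0, leaf) for leaf in range(1, n + 1)]
```

For example, `equitable_chromatic_number(*star(3))` searches colorings of the star with three leaves.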

The usual coloring methods do not seem to go far in studying equitable coloring, 
but we offer the following result. 


THEOREM. If T is a tree with maximum valence $\Delta(T)$, then T can be equitably
colored with $\{\Delta(T)/2\} + 1$ colors.




Proof. The proof is by induction on k, the number of vertices. The least value
of k is $\Delta(T) + 1$ and this occurs only for $K_{1,n}$. By our earlier observation, the base
of the induction is secure.

If $k > \Delta(T) + 1$ then it can be shown there will exist a vertex v with these properties:

1. $\mathrm{val}(v) \ge 2$,

2. all but one, say x, of the vertices adjacent to v are 1-valent,

3. there exists a vertex v' of T, $v' \ne v$, where $\mathrm{val}(v') = \Delta(T)$.

We shall denote val(v) by t. Thus, the vertices other than x which are adjacent
to v are $x_1, x_2, \ldots, x_{t-1}$ (Figure 1).


FIG. 1


We prune the tree T by removing edge xv, thereby detaching a star with center v 
and leaving a tree T’ which can be equitably colored with {A(T)/2} +1 colors by 
the inductive hypothesis. 

Before proceeding further, it is convenient to introduce the color balance vector 
for a coloring of a graph with c colors. If w is the least cardinality of a color class 
in a c-coloring of a graph G, then the color balance vector of the coloring is 
$(b_1, b_2, \ldots, b_c)$ where $b_i$ is the cardinality of the ith color set minus w. Thus a necessary
and sufficient condition for a coloring to be equitable is that its color balance vector 
has only 0 and 1 as entries. 

Now as we restore the pruned star to T’ and recreate T, it is convenient to keep 
track of the color balance vector as we proceed to color the restored vertices. There 
are various cases according to the color balance of the coloring of T’. With no loss 
of generality one can assume in all cases that x is colored with the second color. 
The style of reasoning is similar in all cases, so we prove only the third case. 


CASE 1. $b_i = 0$ for all $i \ne 2$.


CASE 2. There is an $i \ne 2$ where $b_i \ne 0$ and, in addition, the number of zero
$b_i$, denoted z, is greater than or equal to t.


CASE 3. There is an $i \ne 2$ (i = 1 say) where $b_i \ne 0$. Also $z \le t - 1$.


In case 3, color v with the first color and then use each color i for which $b_i = 0$
once to color one of the $x_j$. This produces the color balance vector (1, 0, 0, ..., 0).
However, unless $z = t - 1$ not all the $x_j$ have been colored. If uncolored vertices
remain, we can run through the colors 2, 3, ..., $\{\Delta(T)/2\} + 1$ twice more without
producing an inequitable color balance. Clearly this suffices to color each remaining
$x_j$, since the number of these is < t and $t \le \Delta(T) \le 2\{\Delta(T)/2\}$.

This theorem shows that, in general, $\chi_e(G)$ is not directly related to $\chi(G)$. A bound
for $\chi_e(G)$ involving $\Delta(G)$ seems more reasonable. We offer the following conjecture
whose validity has been verified for all graphs with 6 or fewer vertices.


CONJECTURE. $\chi_e(G) \le \Delta(G)$ for all connected graphs G except the complete
graphs and the odd circuits.


A reasonable place to begin on this conjecture might be with graphs where 
A(G) = 3, for then a spanning 3-tree might already require 3 colors for an equitable 
coloring. Also, from the form of the conjecture, one is led to seek a connection 
with Brooks’ Theorem [1]. 


Editorial Note: It follows from a result of Hajnal-Szemerédi [3] that
$\chi_e(G) \le \Delta(G) + 1$.


References 


1. R. L. Brooks, On colouring the nodes of a network, Proc. Cambridge Philos. Soc., 37 (1941) 
194-197; MR 6-281. 

2. A. C. Tucker, Perfect graphs and an application to optimizing municipal services, SIAM
Review, 15 (1973).

3. A. Hajnal and E. Szemerédi, Proof of a conjecture of P. Erdős, in Combinatorial Theory and
its Applications, Colloq. Math. Soc. János Bolyai, 4, (1970) 601-623.


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


THE LAGRANGE MULTIPLIER RULE 
E. J. MCSHANE, University of Virginia 


Because of its many uses, the Lagrange multiplier rule is presented in many
advanced calculus texts. But the proof is customarily based on the implicit functions
theorem, which to many undergraduates appears sophisticated. Moreover, as L. C.
Young remarks, the theorem for extrema subject to inequality constraints should be
in the advanced calculus texts of the future. We here present a proof of the latter
theorem that uses, beyond quite elementary calculus, only the theorem (Bolzano-
Weierstrass) that from a bounded sequence of points in real n-space $R^n$ one can
always extract a convergent subsequence, and the theorem that a real function f
defined and continuous on a closed ball $\{x \in R^n : |x - a| \le r\}$ in $R^n$ attains its
minimum at some point in that ball. Even these two theorems are proved, e.g., in
Bers' Calculus, pp. S4 and S6. The simpler proof of the theorem, as usually stated,
is obtained by removing all references to inequality constraints.

The coordinates of a point x in $R^n$ will be denoted by $x^1, \ldots, x^n$. If f is defined
on a set G in $R^n$ and $a \in G$, the partial derivative at a of $(f(x) : x \in G)$ with respect to
the jth coordinate will be denoted by $D_j f(a)$, if it exists. Also, as usual,


$\nabla f(a) = (D_1 f(a), \ldots, D_n f(a));$
and
$f^+(x) = \max(f(x), 0).$


THEOREM. Let $f, g_1, \ldots, g_k, h_1, \ldots, h_s$ be defined and continuous, together with
their partial derivatives $D_j f, D_j g_1, \ldots, D_j g_k, D_j h_1, \ldots, D_j h_s$ $(j = 1, \ldots, n)$, on a set G
in $R^n$. Let $x_0$ be an interior point of G at which (i) the conditions


(1)   $g_i(x) = 0$ $(i = 1, \ldots, k)$;   $h_r(x) \le 0$ $(r = 1, \ldots, s)$


are satisfied, and (ii) $f(x_0) \le f(x)$ for all x in G that satisfy (1). Then there are
numbers $\lambda_0, \lambda_1, \ldots, \lambda_k, \mu_1, \ldots, \mu_s$ not all 0 such that


(2)   $\lambda_0 D_j f(x_0) + \sum_{i=1}^{k} \lambda_i D_j g_i(x_0) + \sum_{r=1}^{s} \mu_r D_j h_r(x_0) = 0$ $(j = 1, \ldots, n)$.


Moreover,


(i) $\lambda_0 \ge 0$ and $\mu_r \ge 0$ $(r = 1, \ldots, s)$;

(ii) for each r such that $h_r(x_0) < 0$, $\mu_r = 0$;

(iii) if the $g_i$ and $h_r$ satisfy the "Kuhn-Tucker constraint condition" that the
gradients at $x_0$ of the $g_i$ and those $h_r$ for which $h_r(x_0) = 0$ are linearly independent
vectors, it is possible to choose $\lambda_0 = 1$.


Without loss of generality we assume that $x_0 = (0, 0, \ldots, 0)$; that $f(x_0) = 0$; and
that


$h_1(x_0) = \cdots = h_z(x_0) = 0$,   $h_r(x_0) < 0$ $(r = z + 1, \ldots, s)$.


We choose a positive $\varepsilon_1$ such that the closed ball $B(\varepsilon_1) = \{x \in R^n : |x| \le \varepsilon_1\}$ is
contained in G and $h_{z+1}(x), \ldots, h_s(x)$ are negative on $B(\varepsilon_1)$. We first prove


(3) To each $\varepsilon$ such that $0 < \varepsilon \le \varepsilon_1$ there corresponds an N such that

$f(x) + |x|^2 + N \left\{ \sum_{i=1}^{k} g_i(x)^2 + \sum_{r=1}^{z} [h_r^+(x)]^2 \right\} > 0$

for all x such that $|x| = \varepsilon$.


Suppose false. Then there exist numbers $N_1, N_2, \ldots$ tending to $\infty$ and points
$x_1, x_2, \ldots$ with $|x_m| = \varepsilon$ such that for all m

(4)   $f(x_m) + |x_m|^2 \le -N_m \left\{ \sum_{i=1}^{k} g_i(x_m)^2 + \sum_{r=1}^{z} [h_r^+(x_m)]^2 \right\}.$

A subsequence of the $x_m$ converges to a point x*; without loss of generality we may
suppose this subsequence to be the whole sequence. Then $|x^*| = \lim |x_m| = \varepsilon$ and
$f(x_m) \to f(x^*)$. By dividing both members of (4) by $-N_m$ and letting $m \to \infty$ we obtain

$\sum_{i=1}^{k} g_i(x^*)^2 + \sum_{r=1}^{z} [h_r^+(x^*)]^2 = 0.$

Therefore x* satisfies (1), so $\lim f(x_m) = f(x^*) \ge f(x_0) = 0$. But by (4) $f(x_m) \le -\varepsilon^2$.
This contradiction establishes (3).
We next prove
We next prove 


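(5) To each $\varepsilon$ such that $0 < \varepsilon \le \varepsilon_1$ there corresponds a point $\bar{x}$ and a unit vector
$(\lambda_0, \lambda_1, \ldots, \lambda_k, \mu_1, \ldots, \mu_s)$ with non-negative $\lambda_0, \mu_1, \ldots, \mu_s$ such that $|\bar{x}| < \varepsilon$ and

$\lambda_0 [D_j f(\bar{x}) + 2\bar{x}^j] + \sum_{i=1}^{k} \lambda_i D_j g_i(\bar{x}) + \sum_{r=1}^{s} \mu_r D_j h_r(\bar{x}) = 0$ $(j = 1, \ldots, n)$.

With the N of statement (3), we define F on G by

$F(x) = f(x) + |x|^2 + N \left[ \sum_{i=1}^{k} g_i(x)^2 + \sum_{r=1}^{z} [h_r^+(x)]^2 \right].$

There is a point $\bar{x}$ in the closed ball $B(\varepsilon)$ at which F assumes its least value on $B(\varepsilon)$;
then $F(\bar{x}) \le F(0) = 0$, so by (3) we cannot have $|\bar{x}| = \varepsilon$. So $\bar{x}$ is interior to $B(\varepsilon)$, and
all first-order partial derivatives of F must vanish at $\bar{x}$. At $\bar{x}$ the function $[h_r^+]^2$ has
derivatives

$2 h_r^+(\bar{x}) D_j h_r(\bar{x}),$

obviously if $h_r(\bar{x}) > 0$ and because $[h_r^+]^2$ vanishes at least to second order where
$h_r^+(\bar{x}) = 0$. So for $j = 1, \ldots, n$ we have

(6)   $D_j f(\bar{x}) + 2\bar{x}^j + \sum_{i=1}^{k} 2N g_i(\bar{x}) D_j g_i(\bar{x}) + \sum_{r=1}^{z} 2N h_r^+(\bar{x}) D_j h_r(\bar{x}) = 0.$

Define

$L = \left[ 1 + \sum_{i=1}^{k} [2N g_i(\bar{x})]^2 + \sum_{r=1}^{z} [2N h_r^+(\bar{x})]^2 \right]^{1/2},$

$\lambda_0 = 1/L,$

$\lambda_i = 2N g_i(\bar{x})/L$ $(i = 1, \ldots, k),$

$\mu_r = 2N h_r^+(\bar{x})/L$ $(r = 1, \ldots, z),$

$\mu_r = 0$ $(r = z + 1, \ldots, s).$


Then $(\lambda_0, \lambda_1, \ldots, \lambda_k, \mu_1, \ldots, \mu_s)$ is a unit vector, and $\lambda_0$ and the $\mu_r$ are non-negative,
and if we divide both members of (6) by L we obtain the equation in (5).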

Now choose positive numbers $\varepsilon_1 > \varepsilon_2 > \varepsilon_3 > \cdots$ tending to 0. For m = 1, 2, ...
we choose a point $x_m$ with $|x_m| < \varepsilon_m$ and a unit vector
$(\lambda_{0,m}, \lambda_{1,m}, \ldots, \lambda_{k,m}, \mu_{1,m}, \ldots, \mu_{z,m}, 0, \ldots, 0)$
with non-negative $\lambda_{0,m}$ and $\mu_{r,m}$ such that the equation in (5) (with obvious notational
changes) holds; this is possible, by (5). We choose a subsequence for which the unit
vectors converge to a limit $(\lambda_0, \lambda_1, \ldots, \lambda_k, \mu_1, \ldots, \mu_s)$. Since $x_m \to x_0 = 0$, the equation in (5)
holds with this limit-vector and with $x_0$ in place of $\bar{x}$, which is (2). The theorem is established
except for conclusion (iii).

If the Kuhn-Tucker constraint condition holds we cannot have $\lambda_0 = 0$, since then
(2) would contradict the linear independence of $\nabla g_1(x_0), \ldots, \nabla g_k(x_0), \nabla h_1(x_0), \ldots,
\nabla h_z(x_0)$. So $\lambda_0 > 0$, and the multipliers $(1, \lambda_1/\lambda_0, \ldots, \lambda_k/\lambda_0, \mu_1/\lambda_0, \ldots, \mu_s/\lambda_0)$ satisfy
all the requirements.
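A small numerical check may make conclusion (2) concrete. The example below is hypothetical and not from the note: minimize $f(x, y) = x^2 + y^2$ subject to the single inequality constraint $h(x, y) = 1 - x - y \le 0$. The minimum is at $x_0 = (1/2, 1/2)$, where the constraint is active, and the multipliers $\lambda_0 = 1$, $\mu_1 = 1$ should satisfy (2). Gradients are approximated by central differences, so the residual is only expected to be small, not exactly zero.

```python
def gradient(func, point, step=1e-6):
    """Central-difference approximation to the gradient of func at point."""
    grad = []
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += step
        backward[j] -= step
        grad.append((func(forward) - func(backward)) / (2 * step))
    return grad


def f(p):
    """Objective of the illustrative problem."""
    return p[0] ** 2 + p[1] ** 2


def h(p):
    """Inequality constraint h(p) <= 0 of the illustrative problem."""
    return 1.0 - p[0] - p[1]


def multiplier_residual(lambda0, mu, x0):
    """Left-hand side of (2) for one inequality constraint, per coordinate."""
    grad_f = gradient(f, x0)
    grad_h = gradient(h, x0)
    return [lambda0 * a + mu * b for a, b in zip(grad_f, grad_h)]
```

Calling `multiplier_residual(1.0, 1.0, [0.5, 0.5])` evaluates the left-hand side of (2) at the minimizer. Because the gradient of h is nonzero there, the constraint condition of (iii) holds, which is consistent with taking $\lambda_0 = 1$.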


THE MINI-MAX PROPERTY OF THE TYCHONOFF PRODUCT TOPOLOGY 
D. E. CAMERON, University of Akron 


In 1923 Tietze [5] defined a topology for the Cartesian product of arbitrary 
topological spaces; this topology is often referred to as the box topology. The ac- 
cepted definition for the product topology was given by Tychonoff [6] in 1930; 
it agrees with Tietze’s topology on finite products but differs on infinite products. 
Whatever definition of the product topology is used, it is essential that for finite 
products it agree with the product topologies of both Tietze and Tychonoff because 
this definition is in keeping with the relation between the usual topology of the reals 
and the topology of Euclidean n-space. In this note we shall give a justification for 
the acceptance of the Tychonoff definition. 


DEFINITION 1 (The Tychonoff Product Topology): If $(X_\alpha, T_\alpha)$ for $\alpha$ in A are topological
spaces, the product topology $\times\{T_\alpha : \alpha \in A\}$ on $\times\{X_\alpha : \alpha \in A\}$ (the Cartesian product
of the $X_\alpha$'s for $\alpha$ in A) is the topology which has as its subbase all sets of the form
$\Pi_\alpha^{-1}(U_\alpha)$, where $U_\alpha$ is in $T_\alpha$ for $\alpha$ in A and $\Pi_\alpha : \times\{X_\alpha : \alpha \in A\} \to X_\alpha$ is the projection
mapping for each $\alpha$ in A.


This definition differs from that of Tietze in that Tietze permitted arbitrary
intersections of sets of the form $\Pi_\alpha^{-1}(U_\alpha)$ for his base while Tychonoff's definition
permits only finite intersections of the same sets for its base.


DEFINITION 2. If T and T* are two topologies on a nonempty set X, T is said
to be coarser than T* if $T \subset T^*$ (T* is finer than T).


One sees by comparing the two definitions that Tietze’s topology is finer than 
Tychonoff’s on infinite products but that they are identical on finite products. Ty- 
chonoff’s definition achieved its acceptability primarily because of the following 
result. 


THEOREM 1. The Tychonoff product topology is the coarsest topology on the 
Cartesian product for which all projection mappings are continuous [7, pp. 54-55].


The argument to be considered here for the acceptance of the Tychonoff def- 
inition is that if the Tychonoff product topology were replaced by any other product 
topology then one of the following two theorems would be invalid. 


THEOREM 2. The Tychonoff product topology is a Hausdorff topology if and 
only if each coordinate space is a Hausdorff space [7, p. 87].


THEOREM 3. The Tychonoff product topology is a compact topology if and 
only if each coordinate space is a compact space [7, p. 120].


The latter theorem is called the Tychonoff Product Theorem and is considered 
by many topologists to be the most important theorem in general topology. One 
reason for its importance is the fact that it is equivalent to the axiom of choice [3, 
p. 33 and pp. 143-144; 4].


DEFINITION 3. Let R be a topological property, X a nonempty set. A topology 
T on X with property R is said to be maximal R (minimal R) if for any finer (coarser)
topology T* on X, T* does not have property R. 

Theorem 2 and Theorem 3 are related by the following theorem due to E. Hewitt 
[2, p. 320]. 


THEOREM 4. A compact Hausdorff space is maximal compact and minimal
Hausdorff. 


By using the product theorems previously stated, we see that the Tychonoff
product of compact Hausdorff spaces is maximal compact and minimal Hausdorff.
Therefore if the product topology used were coarser than the Tychonoff topology,
products of Hausdorff spaces would not necessarily be Hausdorff. Also products of
metric spaces would not be metric spaces, as can be seen by taking suitable products
of [0,1] (the closed unit interval) with the usual topology, which is a compact metric
space, and recalling the fact that all metric spaces are Hausdorff. Another property
which would not be productive is $T_1$-complete regularity, since M. P. Berri [1] has
shown that the minimal $T_1$-completely regular topologies are precisely the compact
Hausdorff topologies.

If the product topology used were finer than the Tychonoff topology (as in the 
case of the Tietze topology) then the products of compact spaces would not neces- 
sarily be compact. Therefore the Tychonoff product topology may be considered 
as being mini-max, a strong argument for its acceptance. 


References 


1. M. P. Berri, Minimal topological spaces, Trans. Amer. Math. Soc., 108 (1963) 97-105.

2. E. Hewitt, A problem of set-theoretic topology, Duke Math. J., 10 (1943) 309-333.

3. J. L. Kelley, General Topology, Van Nostrand, Princeton, N. J., 1955.

4. J. L. Kelley, The Tychonoff product theorem implies the axiom of choice, Fund. Math.,
37 (1950) 75-76.

5. H. Tietze, Über Analysis Situs, Abhandl. Math. Sem. Univ. Hamburg, 2 (1923) 27-70.

6. A. Tychonoff, Über die topologische Erweiterung von Räumen, Math. Ann., 102 (1930)
544-561.

7. S. Willard, General Topology, Addison-Wesley, Reading, Mass., 1970.


ANOTHER APPROACH TO THE CUBIC INTERPOLATING SPLINE 
B. H. RosMAN, Framingham State College 


1. Introduction. A topic that is permeating numerical analysis courses at all 
levels is the problem of constructing a cubic spline which interpolates to data values 
at the joints. In this note two common approaches to this problem are surveyed and 
a new variant to one of these approaches is discussed. A cubic spline s(x) on [a, b] 
with $n - 1$ interior joints $a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$ is a $C^2$ function
such that on $[x_{i-1}, x_i]$, s(x) is a cubic polynomial, i.e., s(x) is a piecewise cubic with
extra smoothness requirements at the joints.


2. The interpolation problem. The problem discussed here is: given data
values $f_i$ at the $x_i$, i = 0, 1, 2, ..., n, find a cubic spline s(x) (with joints at the $x_i$) such
that $s(x_i) = f_i$; i = 0, 1, ..., n. It follows easily from the above definition that any
cubic spline has the form

(1)   $s(x) = \sum_{i=0}^{3} a_i x^i + \sum_{k=1}^{n-1} c_k (x - x_k)_+^3,$

for some choice of real values $a_i$, $c_k$, where

$(x - x_k)_+^3 = \begin{cases} 0, & x - x_k < 0, \\ (x - x_k)^3, & x - x_k \ge 0. \end{cases}$

From (1), we see that this is a linear problem with n + 1 equations ($s(x_i) = f_i$) and
n + 3 unknowns ($a_i$, $c_k$). To remove two degrees of freedom, end conditions are
imposed. The most common conditions are (A): $s'(x_i) = f_i'$; i = 0, n, or (B): $s''(x_i) = 0$;
i = 0, n, where $f_i'$ is $f'(x_i)$ for the function to be fitted, f(x), or, if we are fitting
discrete values $f_i$, is just another data value.


3. Methods of solution. For computational reasons, the system (1) is usually 
not solved directly. Often another linear system is derived which gives the spline 
locally, i.e., the correct cubic is derived for each cell [ x;_,,x;]. The usual approaches 
for doing this differ according to whether end conditions (A) or (B) are given. 

For condition (A) the approach is as follows (see [2, chap. 1], [3, chap. 4] for
details). Using Hermite interpolation formulas a cubic $p_{i-1}(x)$ is derived in $[x_{i-1}, x_i]$
which takes on the values $p_{i-1}(x_j) = f_j$ and $p'_{i-1}(x_j) = s'_j$; j = i - 1, i. Similarly,
in $[x_i, x_{i+1}]$, $p_i(x)$ is derived such that $p_i(x_j) = f_j$; $p'_i(x_j) = s'_j$, j = i, i + 1. Since
we have imposed four conditions in each cell, this means that once we have determined
the $s'_j$ values, we know (by substituting into each local Hermite cubic) the spline in
each cell. To get an easily solvable system for the $s'_j$ values, we compute $p''_{i-1}(x_i)$ and
$p''_i(x_i)$ and equate the results (since we impose $C^2$ smoothness). Doing this for
i = 1, 2, ..., n - 1, we get a tridiagonal strictly diagonally dominant linear system
with conditions (A) used to eliminate the unknowns $s'_0$, $s'_n$ ([2], [3]).

When conditions (B) are imposed, a different method of derivation is usually
given (see [1, chap. 2], [4, chap. 1]). In each cell $[x_{i-1}, x_i]$, $s''(x)$ is a straight line of
the form


(2)   $s''(x) = s''_{i-1} + \dfrac{x - x_{i-1}}{h}\,(s''_i - s''_{i-1}),$


where $s''_i$, $s''_{i-1}$ are (as yet unknown) values of $s''(x)$ at $x_i$, $x_{i-1}$ and $h = x_i - x_{i-1}$.
To get from (2) to a local formula for s(x), two integrations are performed, more
constants are introduced (the $s'_i$ and $f_i$ values) and by algebraic manipulation, the
$s'_i$ values are eliminated and a tridiagonal strictly diagonally dominant system is
obtained for the $s''_i$ values with conditions (B) used to eliminate $s''_0$, $s''_n$. When this is
solved s(x) is then known in each cell.


4. Another approach. Although the two approaches lead to similar linear
systems, the method used for conditions (B) is more cumbersome. Some of my
students and I have wondered why the more direct approach for conditions (A)
cannot be used for conditions (B). The answer is it can, and the solution furnishes
an instructive example of interpolation at work.

The idea is to find a cubic p(x) such that $p(x_i) = f_i$; $p''(x_i) = f_i''$, i = 0, 1, where
$f_i$, $f_i''$ are arbitrary values. Then the approach for condition (A) can be imitated,
leading directly to the linear system associated with conditions (B). (Here, we equate
the first derivatives at $x_i$.) The interpolation problem posed here is a special case of a
Hermite-Birkhoff interpolation problem, i.e., there may be gaps in the data conditions
(the required derivatives of p(x) at a given point are not consecutive) and as such
may not possess a solution. (Specifically, consider n ordered pairs (i, j), where i, j
are integers with $1 \le i \le k \le n$, $0 \le j \le n - 1$. Let $x_1 < x_2 < \cdots < x_k$ be any set
of numbers and for each of the (i, j), let $y_i^j$ be a given data value. The Hermite-
Birkhoff interpolation problem is: find a polynomial p(x) of degree at most $n - 1$
such that $p^{(j)}(x_i) = y_i^j$ for each (i, j).) However, here we do have a unique solution,
for by explicit computation the determinant of the system for the coefficients
$a_i$ of the cubic $p(x) = \sum_{i=0}^{3} a_i x^i$ has the value $-12(x_1 - x_0)^2$. (The determinant
is derived from the conditions $p(x_i) = f_i$; $p''(x_i) = f_i''$ at the distinct points
$x_0$, $x_1$.)

The next step is to get a Lagrange type formula for the interpolating cubic. 
To this end we want cubics po, Py, do, q,, Such that po(x) = pi(x) = 0 at x = Xo, 
X,° P(X;) = 6, jf = 0,1; and qo(X) = qi(x) = 0 at x = Xo, X1, Gi (X;) = Oyj, 
j = 0,1, where 6;; is the Kronecker delta. For then we can write p(x) as 


1 1 
(3) P(x) = 2% Ki) + 2 fi gil). 


To get $p_0, p_1$, note that $p_0''$ is a linear polynomial with two zeroes; hence $p_0'' \equiv 0$, so
that $p_0(x)$ is just a straight line passing through the points $(x_0, 1)$, $(x_1, 0)$. The same
reasoning applies to $p_1$, so that


(4) $p_0(x) = \dfrac{x - x_1}{x_0 - x_1}, \qquad p_1(x) = \dfrac{x - x_0}{x_1 - x_0}.$


To get $q_0, q_1$, note that $q_0$ has zeroes at $x_0, x_1$; so we get that $q_0(x) = (x - x_0)(x - x_1)(ax + b)$,
where $a, b$ are picked to satisfy the conditions on $q_0(x)$. Reasoning
similarly for $q_1(x)$ we get


(5) $q_0(x) = \dfrac{w(x)(x + x_0 - 2x_1)}{6(x_0 - x_1)}, \qquad q_1(x) = \dfrac{w(x)(x + x_1 - 2x_0)}{6(x_1 - x_0)},$


where $w(x) = (x - x_0)(x - x_1)$.

Finally we remark that unfortunately this approach cannot be extended to higher 
degrees. For example consider the quintic $p(x)$ such that $p(x_i) = f_i$; $p''(x_i) = f_i''$,
$i = 0, 1, 2$. With $x_0 = -1$; $x_1 = 0$; $x_2 = 1$, the determinant of the coefficient
matrix for p(x) is zero. 
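
The two determinant claims of this note can be checked by machine. The following Python sketch is not part of the note; the function names and the ordering of the rows (value first, then second derivative, node by node) are assumptions made for illustration. It builds the coefficient matrix of the conditions on $p$ and $p''$ at the given nodes and evaluates its determinant in exact rational arithmetic.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    sign, result = 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result

def value_row(x, degree):
    # Coefficients of a_0..a_degree in p(x).
    return [x**k for k in range(degree + 1)]

def second_derivative_row(x, degree):
    # Coefficients of a_0..a_degree in p''(x).
    return [k * (k - 1) * x**(k - 2) if k >= 2 else 0 for k in range(degree + 1)]

def system_matrix(nodes, degree):
    """Rows p(x_i), p''(x_i) for each node (row order is an assumption)."""
    rows = []
    for x in nodes:
        rows.append(value_row(x, degree))
        rows.append(second_derivative_row(x, degree))
    return rows

cubic = det(system_matrix([2, 5], 3))        # nodes x_0 = 2, x_1 = 5
quintic = det(system_matrix([-1, 0, 1], 5))  # the example in the text
```

With this row ordering the cubic determinant agrees with $-12(x_1 - x_0)^2$, and the quintic determinant vanishes, which is consistent with the nonzero quintic $7x - 10x^3 + 3x^5$ having zero values and zero second derivatives at $-1, 0, 1$.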


References 


1. J. H. Ahlberg, E. N. Nilson, and J. L. Walsh, The Theory of Splines and Their Applications,
Academic Press, New York, 1967.
2. T. N. E. Greville, Theory and Applications of Spline Functions, Academic Press, New York,
1969.
3. T. J. Rivlin, An Introduction to the Approximation of Functions, Blaisdell, Waltham, Mass.,
1969.
4. B. Wendroff, Theoretical Numerical Analysis, Academic Press, New York, 1966.


MATHEMATICAL EDUCATION 
EDITED BY J. G. HARVEY AND M. W. POWNALL


Material for this Department should be sent to Shirley Hill, Department of Mathematics, 
University of Missouri, Kansas City, MO 64110, or to Paul Mielke, Department of Mathematics,
Wabash College, Crawfordsville, IN 47933.


ON BEHAVIORAL OBJECTIVES IN MATHEMATICS EDUCATION 


L. C. JANSSON and R. T. Heimer, Pennsylvania State University 


In a recent issue of the MONTHLY, Allendoerfer [1] discussed with some insight the
problem of the selection of elementary mathematics textbooks. He notes, correctly, 
we believe, that textbooks are marketed on the creative writing abilities of advertisers 
rather than on proven quantitative claims as to how well they teach or what their 
effects on students are. 

Allendoerfer goes on to discuss the notion of "behavioral objectives" as a key to
the writing and evaluation of textbooks. Though the term is slandered by referring to
it as "psychological jargon," it appears to be of enough value that it serves as the
basis for his entire article. 

A behavioral objective is simply a very explicit statement of what it is one expects
a student to be able to do following a unit of instruction. One commonly employed 
form contains three components: 


(1) The Given: the circumstances under which a particular performance is to take place. In a
problem situation the "givens" of the problem would be included.

(2) The Required Performance: an explicit statement of the task to be performed.

(3) The Criterion: a statement of the conditions under which the performed task is judged to be
satisfactory.


Thus we must state not only the desired goal behavior, but also the conditions 
under which it is to take place and the conditions of acceptable performance. An 
example of an instructional objective for a modern algebra course written in this 
form follows:




Recommendations. Even though the exceptions total only 7% it is our personal 
belief the objective test did a more realistic job of evaluating performance in the course. 
We feel that at least the 30 students in the unacceptable range would have been evalu- 
ated unfairly if multiple-choice tests had been used exclusively in determining grades. 
We would have preferred to discuss the differences in the test scores with certain 
students, especially those in the unacceptable range, but this proved to be impractical. 

In addition to the possible incorrect evaluation factor, there is another equally 
important implication in using multiple-choice tests exclusively. While they can be 
effectively used to check mathematics skills we feel rather strongly that their exclusive 
use discourages the development of the student's ability to communicate using
mathematics. Of course, this has not been verified statistically in this report, but our
experiences with similar students in follow-up courses lead us to this belief. Certainly
conscientious marking of homework papers can also help in the development of ade- 
quate mathematical communication skills. In conclusion, we suggest that college in- 
structors make a thoughtful evaluation of their use of multiple-choice exams. We 
would be pleased to hear from others with thoughts on this subject. 


PROBLEMS AND SOLUTIONS 
EDITED BY EMORY P. STARKE


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN,
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: EARL
M. L. BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, WILLIAM
R. GEIGER, RAMESH GUPTA, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS
S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND WILLIAM L. SOULE, JR.


All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04478. To facilitate their consideration, solutions of Elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
January 31, 1974. 


An asterisk (*) means neither the proposer nor the editors supplied a solution. 




E 2432*. Proposed by L.-S. Hahn, University of New Mexico 


For each natural number n, let f(n) be the smallest natural number that cannot 
be expressed as a combination of n n’s using only the rational operations of addition, 
subtraction, multiplication and division, together with unlimited use of parentheses. 
(No exponents, factorials, decimal points, etc., are allowed.) For example, f(4) = 10. 
Prove or disprove that f(n) is an increasing function of n for n ≥ 3.
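
For readers who wish to experiment, the definition of f(n) can be evaluated directly for small n. The following Python sketch is an illustration only and not part of the proposal; the function names are hypothetical. It assumes that exactly n copies of n must be used and that digits may not be concatenated, and it builds every value obtainable from k copies by combining the values obtainable from smaller groups.

```python
from fractions import Fraction

def reachable(n):
    """Set of values obtainable from exactly n copies of n with +, -, *, /."""
    values = {1: {Fraction(n)}}
    for k in range(2, n + 1):
        current = set()
        # Any expression on k copies splits into a left part and a right part.
        for left in range(1, k):
            for a in values[left]:
                for b in values[k - left]:
                    current.update((a + b, a - b, a * b))
                    if b != 0:
                        current.add(a / b)
        values[k] = current
    return values[n]

def f(n):
    """Smallest natural number not expressible with exactly n copies of n."""
    expressible = reachable(n)
    m = 1
    while Fraction(m) in expressible:
        m += 1
    return m
```

Under these assumptions the sketch reproduces the value f(4) = 10 quoted in the problem; it only tabulates small cases and says nothing about monotonicity in general.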


E 2433. Proposed by R. T. Smythe, University of Washington 


Let $x_1, x_2, \ldots$ be a sequence of real numbers. What is sometimes called Kronecker's
Lemma asserts that if $\sum_{n=1}^{\infty} n^{-1} x_n$ is convergent, then $\bar{x}_n \to 0$ as $n \to \infty$, where
$\bar{x}_n = n^{-1}(x_1 + \cdots + x_n)$. (That is, $x_n \to 0$ in Cesàro mean.) The converse of this
lemma is false: let $x_n = (\log(n + 1))^{-1}$.

Prove the following partial converse: If $\bar{x}_n \to 0$ as $n \to \infty$, then $\sum_{n=1}^{\infty} n^{-1-\delta} x_n$
converges for all $\delta > 0$.


E 2434. Proposed by George O’Brien, York University, Ontario 


Suppose that $\{a_n\}$ and $\{b_n\}$ are sequences of nonnegative numbers such that
$(a_n)^n \to a$ and $(b_n)^n \to b$. Let p and q be nonnegative numbers such that p + q = 1.
Evaluate $\lim (p a_n + q b_n)^n$. Generalize.


E 2435. Proposed by Wells Johnson, Bowdoin College 


Let p ≥ 5 be a prime number. Prove that there exist at least two distinct primes
$q_1, q_2$ satisfying $1 < q_i < p - 1$ and $(q_i)^{p-1} \not\equiv 1 \pmod{p^2}$.


E 2436. Proposed by E. T. H. Wang, University of Waterloo 


An n × n matrix with nonnegative entries is doubly stochastic if each row sum
and each column sum is 1. Characterize those doubly stochastic n × n matrices
that commute with all doubly stochastic n x n matrices. 


E 2437. Proposed by David McLean, Warren, Michigan 


Let $\{a_n\}$ and $\{b_n\}$ be two sequences of real numbers that decrease monotonically
to 0. If the series $\sum a_n$ and $\sum b_n$ both converge, then so does the series $\sum \max(a_n, b_n)$.
Suppose that the series $\sum a_n$ and $\sum b_n$ both diverge. Does it follow that the series
$\sum \min(a_n, b_n)$ must diverge?


SOLUTIONS OF ELEMENTARY PROBLEMS 
R[x] Can Contain √x
E 2370 [1972, 772]. Proposed by John Hyde, Student, St. Olaf College


Let R be a ring with identity, and let R[x] be the ring of polynomials over R in
the indeterminate x. A current modern algebra textbook asks the student to prove
that R[x] cannot contain $\sqrt{x}$; that is, R[x] cannot contain a polynomial f(x) such
that $[f(x)]^2 = x$. Find an example of a ring R and a polynomial f(x) that disproves
this. Can R be commutative?


Solution by Robert Gilmer, Florida State University. For a ring R with identity
and a positive integer n, let $R^{(n)}$ denote the ring of n × n matrices over R. If n > 1,
then X has an nth root in $R^{(n)}[X]$. In fact, if $A_n$ is the n × n matrix over R with 1
on the main subdiagonal and zeros elsewhere and if $B_n$ is the n × n matrix over R with
1 in the upper right hand corner and zeros elsewhere, then it is easy to show that
$(A_n + B_n X)^n = X$. The motivation for our choice of $A_n$ and $B_n$ comes from the
canonical isomorphism $\phi$ between $R^{(n)}[X]$ and $(R[X])^{(n)}$; $A_n$ and $B_n$ are chosen so
that $\phi(A_n + B_n X)$ is the companion matrix of $Y^n - X$ over $R[X]$. As a consequence,
it follows that if S is the ring of infinite matrices over R, each of whose rows and
columns contains only finitely many nonzero entries, then for each k > 1, X has a
kth root in S[X]; if A is the element of S with $A_k$ down the main diagonal and zeros
elsewhere and if B is the element of S with $B_k$ down the main diagonal and zeros
elsewhere, then $(A + BX)^k = X$. More generally, if $r \in R$, then $(A + rB)^k = rI$, where
I is the identity element of S. Thus if we identify R with its image under the isomorphism
$r \to rI$ of R into S, then each element of R has a kth root in S, for each positive
integer k.
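
The matrix identity in the solution is easy to test numerically. The following Python sketch is an illustration only; it takes R to be the integers, replaces the indeterminate X by an integer value r, and checks a single n × n block rather than the infinite matrices of the solution. The function names are hypothetical. Agreement for several values of r is evidence for, not a proof of, the polynomial identity.

```python
def root_matrix(n, r):
    """A_n + r*B_n: ones on the main subdiagonal, r in the upper right corner."""
    m = [[0] * n for _ in range(n)]
    for i in range(1, n):
        m[i][i - 1] = 1
    m[0][n - 1] = r
    return m

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, e):
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(e):
        result = mat_mul(result, m)
    return result

def is_nth_root_of_scalar(n, r):
    """True when (A_n + r*B_n)^n equals r times the identity matrix."""
    expected = [[r if i == j else 0 for j in range(n)] for i in range(n)]
    return mat_pow(root_matrix(n, r), n) == expected
```

For n = 2 the matrix is [[0, r], [1, 0]], whose square is r times the identity, which is the simplest instance of the construction.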

If the ring R is commutative, then X has no nth root in R[X] for n > 1, for if P
is a proper prime ideal of R and if $\mu_P$ is the natural homomorphism from R onto
R/P, then $\mu_P^*: R[X] \to (R/P)[X]$, defined by $\mu_P^*(\sum r_i X^i) = \sum \mu_P(r_i) X^i$, is a
homomorphism such that $\mu_P^*(X) = X$, and since R/P is an integral domain, it is clear that
X is not the nth power of an element of (R/P)[X]. It is of some interest to note
that an element f of R[X] may be an nth power modulo P for each proper prime ideal
P of R while f is not an nth power in R[X]. This occurs, for example, if R = Z/(4)
and $f = 2 + X^n$.


Also solved by 43 others. 


An Elusive Cubic Equation 
E 2371 [1972, 772]. Proposed by M. H. Greenblat 


A clever graduate student (CGS) was discussing a mathematical problem with his
friend, the absent-minded professor (AMP). The CGS asked, "Do you remember
the cubic equation we solved several weeks ago, you know the one in which the
coefficients of all the terms were positive integers? It had integral roots, and the
coefficient of the cubic term was unity?"

AMP: "Well, I remember it only vaguely."

CGS: "I'd like to reconstruct it. Do you remember the value of the constant
term?"

AMP: "Not precisely. I remember it was either 2450 or 2540."

CGS: "Well, do you remember the coefficient of the square term?"

AMP: "I'm afraid not, but it wouldn't help you even if I did remember it."
(In this, he underestimated the CGS.)

CGS: "Aha! Was the coefficient of the linear term as high as it could possibly
be?"

AMP: "Yes."


At this point, the CGS knew the equation in question. You can, too, with the
above information.


Solution by the St. Olaf Problem Solving Group. Recall that
$a = -\sum x_i$, $b = \sum x_i x_j$, and $c = -x_1 x_2 x_3$,


where $x_1, x_2, x_3$ are the roots of $x^3 + ax^2 + bx + c = 0$. Since a, b, c are positive,
it follows that all roots are negative. Since $|x_1 x_2 x_3| = 2450$ or $2540$ ($2 \cdot 5^2 \cdot 7^2$ or
$2^2 \cdot 5 \cdot 127$), we list all possible combinations of three factors and check for cases
which give identical coefficients of the square term. The only two are


$x_1 = -5$, $x_2 = -10$, $x_3 = -49$ and $x_1 = -7$, $x_2 = -7$, $x_3 = -50$.


Both give 64 as the coefficient of the square term. The first case gives a linear
coefficient of 785 and the second 749. Since the linear
coefficient is to be as large as possible, we must have the polynomial equal to


$(x + 5)(x + 10)(x + 49) = x^3 + 64x^2 + 785x + 2450.$
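
The search described in the solution is short enough to carry out by machine. The following Python sketch is an illustration and not the solvers' own procedure; the function names are hypothetical. It lists every triple of positive integers whose product is 2450 or 2540 (the roots being their negatives), keeps the equations whose square-term coefficient is shared with another candidate (so that knowing it "wouldn't help"), and then takes the largest linear coefficient among those.

```python
from collections import defaultdict

def root_triples(constant):
    """All triples 1 <= u <= v <= w with u*v*w == constant (roots are -u, -v, -w)."""
    triples = []
    u = 1
    while u * u * u <= constant:
        if constant % u == 0:
            rest = constant // u
            v = u
            while v * v <= rest:
                if rest % v == 0:
                    triples.append((u, v, rest // v))
                v += 1
        u += 1
    return triples

def reconstruct(constants=(2450, 2540)):
    by_square_coeff = defaultdict(list)
    for c in constants:
        for u, v, w in root_triples(c):
            a = u + v + w                  # coefficient of x^2
            b = u * v + u * w + v * w      # coefficient of x
            by_square_coeff[a].append((a, b, c))
    # AMP's remark: the square-term coefficient alone would not settle the equation.
    candidates = [eq for eqs in by_square_coeff.values() if len(eqs) > 1 for eq in eqs]
    # CGS's last question: the linear coefficient is as large as possible.
    return max(candidates, key=lambda eq: eq[1])
```

The returned triple (a, b, c) corresponds to the equation $x^3 + ax^2 + bx + c = 0$.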


Also solved by Anders Bager (Denmark), Larry Byrd & Truett Mathis, D. W. Erbach (England), 
and the proposer. 


Editorial Note. No solver mentioned that no root could be positive by Descartes’ rule of signs. 
C. W. Karns stated that “‘the statement that AMP underestimated the ability of CGS implies that 
x1 + x2 + x3 is unique.” He then finds all values for the coefficient b, the maximum value yielding 
a= 2542, b= 5081, and c= 2540 (using the notation of the above solution). Twenty-one other 
solvers arrived at the same solution by ignoring the comment about the square term coefficient. 


Diagonals in a 0-1 Matrix 


E 2372 [1972, 773]. Proposed by E. T. H. Wang, University of Waterloo,
Canada


Let A be an n × n matrix with entries zero and one, such that each row and each
column contains precisely k ones. A generalized diagonal of A is a set of n elements 
of A such that no two elements appear in the same row or the same column. Show 
that A has at least k pairwise disjoint generalized diagonals, each of which consists 
entirely of ones. 




Solution by Mike Vitale, Skidmore College. The solution proceeds by induction.
If k = 1, pick the n ones of A as the generalized diagonal. Suppose the assertion is
true for 1, 2, ..., k − 1, and each row and column of A contains exactly k ones.
We first choose a single generalized diagonal of A which consists entirely of ones.
Pick a one from each of rows 1, 2, ..., k at random, subject only to the condition
that no two are in the same column. This can always be done, since each row contains
k ones. Suppose each of the ones in row k + 1 lies in a column which has already
been chosen. (If not, pick an available one from this row, and proceed.) In at least
one of the rows 1, 2, ..., k there is a one which does not lie above a one in row k + 1.
(For if not, the columns containing the ones of row k + 1 have at least k + 1 > k
ones in them, a contradiction.) Call this row i. Replace the element chosen from
row i originally by the element not lying above a one in row k + 1, and from row
k + 1 choose the one lying below the original element chosen from row i. Now go
to row k + 2; again, if there is an available one, choose it and go on to the next row.
If not, there must be a row 1, 2, ..., i − 1, i + 1, ..., k + 1 which contains a one not
lying above a one in row k + 2. Perform the switch outlined above and go on to the
next row.

In this way we can choose a generalized diagonal of A which consists entirely of 
ones. Replace each of the ones chosen by a zero to obtain a new matrix A’ which 
has exactly k — 1 ones in each row and column. By induction, we can find k — 1 
pairwise disjoint generalized diagonals which consist entirely of ones. Together 
with the first generalized diagonal chosen, these form a set of k generalized diagonals 
consisting entirely of ones. 
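
The peeling argument above (find one diagonal of ones, delete it, repeat) translates directly into a program. The following Python sketch is an illustration only: instead of the row-switching procedure of the solution it uses the standard augmenting-path method for bipartite matching to find each diagonal, and the function names are hypothetical.

```python
def perfect_matching(matrix):
    """Find a generalized diagonal of ones; return the column chosen in each row."""
    n = len(matrix)
    row_of_col = [None] * n

    def try_row(row, seen):
        # Try to give `row` a column, re-seating earlier rows if necessary.
        for col in range(n):
            if matrix[row][col] == 1 and col not in seen:
                seen.add(col)
                if row_of_col[col] is None or try_row(row_of_col[col], seen):
                    row_of_col[col] = row
                    return True
        return False

    for row in range(n):
        if not try_row(row, set()):
            return None
    col_of_row = [None] * n
    for col, row in enumerate(row_of_col):
        col_of_row[row] = col
    return col_of_row

def disjoint_diagonals(matrix, k):
    """Peel off k pairwise disjoint diagonals of ones, as in the induction step."""
    work = [row[:] for row in matrix]
    diagonals = []
    for _ in range(k):
        diag = perfect_matching(work)
        if diag is None:
            return None
        diagonals.append(diag)
        for row, col in enumerate(diag):
            work[row][col] = 0   # replace the chosen ones by zeros
    return diagonals

# Example: a 5 x 5 circulant matrix with exactly 3 ones in each row and column.
example = [[1 if (j - i) % 5 in (0, 1, 2) else 0 for j in range(5)] for i in range(5)]
found = disjoint_diagonals(example, 3)
```

Each entry of `found` lists, row by row, the column of the one selected for that diagonal.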

Also solved by L. W. Beineke & R. E. Pippert, The Bennett College Team, R. B. Eggleton, Bill
Knight, Joel Levy, L. E. Mattics, Henryk Minc, K. R. Rebman, Richard Sinkhorn, Phil Tracy, and
the proposer.


Editor’s Note. The solution above is self contained. Several readers pointed out that the result in 
question is an immediate consequence of well-known theorems in the literature. Minc cites D. König,
Theorie der endlichen und unendlichen Graphen (1936), chap. XIV, s. 3, Proposition B; also M. Marcus 
and H. Minc, Modern University Algebra (1966), Theorem 4.3. 


Balancing Integral Weights 


E 2374 [1972, 905]. Proposed by Judith Q. Longyear, Pennsylvania State 
University 


Suppose that $a_1 \le a_2 \le \cdots \le a_n$ are natural numbers such that $a_1 + \cdots + a_n = 2n$
and such that $a_n \ne n + 1$. Show that if n is even, then for some subset K of $\{1, 2, \ldots, n\}$
it is true that $\sum_{i \in K} a_i = n$. Show that this is true also if n is odd when we make the
additional assumption that $a_n \ne 2$.


Solution by J. G. Mauldon, Amherst College. Let $s_k = a_1 + \cdots + a_k$ for
$k = 1, 2, \ldots, n - 1$ and consider the set of n + 1 (distinct) numbers


$\{0,\ a_1 - a_n,\ s_1,\ s_2,\ \ldots,\ s_{n-1}\}$.


By the Pigeonhole Principle, at least one pair of elements are congruent to each other
modulo n. We distinguish four cases.

(1) If $a_1 - a_n \equiv 0 \pmod n$, then $a_1 = a_n$ since $0 \ge a_1 - a_n \ge -n + 1$. Thus in
this case it follows that $a_1 = \cdots = a_n = 2$. If n = 2m is even, then obviously any
m-element subset K will do, whereas if n is odd, this circumstance is prohibited by
assumption.

(2) If $s_i \equiv s_k \pmod n$ for some $1 \le i < k \le n - 1$, then since $2n - 2 \ge s_k - s_i \ge 1$
it follows that $s_k - s_i = n$, i.e., $a_{i+1} + \cdots + a_k = n$, and we can take
$K = \{i + 1, i + 2, \ldots, k\}$.

(3) If $s_k \equiv 0 \pmod n$ for some $1 \le k \le n - 1$, then $1 \le s_k \le 2n - 1$, so that
necessarily $s_k = n$, and we can take $K = \{1, 2, \ldots, k\}$.

(4) If $s_k \equiv a_1 - a_n \pmod n$ for some $1 \le k \le n - 1$, then either k = 1, implying
$a_n \equiv 0 \pmod n$ and thus $a_n = n$ (since $a_n \le n$) so that $K = \{n\}$ will do, or else $k \ge 2$,
in which case $a_2 + \cdots + a_k + a_n \equiv 0 \pmod n$. In this latter case,
$1 \le a_2 + \cdots + a_k + a_n < a_1 + a_2 + \cdots + a_k + \cdots + a_n = 2n$, so that $a_2 + \cdots + a_k + a_n = n$, and we
can take $K = \{2, 3, \ldots, k, n\}$.

Also solved by Bennett College Team, M. T. Bird, R. B. Eggleton, C. S. Gardner, S. I. Gendler,
R. A. Gibbs & H. S. Stocker, Michael Goldberg, M. G. Greening (Australia), Erwin Just, Jonathan
Kane, E. M. Klein, O. P. Lossers (Netherlands), John MacBain, W. D. Markel, W. W. Parsons,
Kenneth Rosen, Michael Shimshoni (Israel), Alan Stein, Guy Torchinelli, and Charles Wexler.


Editorial Comment: This problem has the following interesting interpretation. Suppose that one 
has a number of blocks of integral weight, which average 2 weight units apiece, and which have the 
property that the heaviest block is not heavier than all the rest of the blocks put together. Then, with 
the exception of the case where one has an odd number of blocks all of weight 2, it is always possible 
to separate the blocks into two groups of equal weight. Moreover, it can be shown that the following 
algorithm will always lead to a solution of the problem: Take a two-pan balance, and starting with 
the heaviest weight and proceeding step-by-step with the next heaviest, etc., place the weights one at 
a time on that pan of the balance which is the lighter (if the pans are balanced, choose either). Then 
the placing of the last (lightest) block must necessarily make the two pans balance exactly. 
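
The two-pan procedure described in the comment can be tried on every admissible set of weights for small n. The following Python sketch is an illustration only; the function names are hypothetical, and ties are resolved by always choosing the left pan, which is one of the choices the comment allows.

```python
def greedy_balance(weights):
    """Place weights, heaviest first, on whichever pan is currently lighter."""
    left, right = [], []
    for w in sorted(weights, reverse=True):
        if sum(left) <= sum(right):   # tie: choose the left pan
            left.append(w)
        else:
            right.append(w)
    return left, right

def admissible_weights(n):
    """Nondecreasing naturals a_1..a_n with sum 2n and a_n <= n,
    omitting the all-2 case when n is odd (the stated exception)."""
    def extend(prefix, remaining, slots):
        if slots == 0:
            if remaining == 0:
                yield tuple(prefix)
            return
        start = prefix[-1] if prefix else 1
        for a in range(start, min(n, remaining) + 1):
            yield from extend(prefix + [a], remaining - a, slots - 1)

    for seq in extend([], 2 * n, n):
        if n % 2 == 1 and all(a == 2 for a in seq):
            continue
        yield seq

def always_balances(n):
    return all(sum(side) == n
               for seq in admissible_weights(n)
               for side in greedy_balance(seq))
```

For example, the weights (1, 1, 2, 2, 4) are placed as 4 | 2, then 4 | 2 + 2, then 4 + 1 | 2 + 2, and finally 4 + 1 | 2 + 2 + 1, each pan carrying 5.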


Difference Sets of Commutative Groups 


E 2375 [1972, 905]. Proposed by H. Kestelman, University College, London, 
England 


Let G be an abelian group. For any subset S of G, let D(S) denote the set of
differences x − y, where $x, y \in S$. Show that if A and B are any subsets of G such that
$G = A \cup B$, then either $D(A) \supseteq B$ or $D(B) \supseteq A$. Show further that if $G = A \cup B$
and if A and B are not disjoint, then D(A) = G or D(B) = G.


Solution by John Coolidge, Florida State University. If $B \not\subseteq D(A)$, then there
exists an element $b \in B \setminus D(A)$. For any $a \in A$, $b + a \notin A$, since this would imply
$b = (b + a) - a \in D(A)$; hence $b + a \in B$. Thus $a = (b + a) - b \in D(B)$ and $A \subseteq D(B)$.

Let $S - x = \{s - x \mid s \in S\}$. Since $A \cap B \ne \emptyset$, let $x \in A \cap B$. Obviously
$0 \in (A - x) \cap (B - x)$ and $G = (A - x) \cup (B - x)$. By the preceding result assume
$(A - x) \subseteq D(B - x)$. Since $0 \in B - x$, $B - x \subseteq D(B - x)$. Therefore $G \subseteq D(B - x)$.
Also $D(B - x) \subseteq D(B)$ because $(b_1 - x) - (b_2 - x) = b_1 - b_2 \in D(B)$. Hence
G = D(B).
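
Both assertions can be confirmed exhaustively for small cyclic groups. The following Python sketch is an illustration only; it takes G to be the integers modulo n (an assumption, since the problem concerns arbitrary abelian groups), and the function names are hypothetical. Every covering G = A ∪ B is generated by assigning each element to A only, to B only, or to both.

```python
from itertools import product

def differences(subset, n):
    """D(S) in the cyclic group Z_n."""
    return {(x - y) % n for x in subset for y in subset}

def check_cyclic_group(n):
    group = set(range(n))
    for labels in product("ABC", repeat=n):       # 'C' marks an element of both sets
        a = {g for g in group if labels[g] in "AC"}
        b = {g for g in group if labels[g] in "BC"}
        da, db = differences(a, n), differences(b, n)
        if not (b <= da or a <= db):              # first assertion
            return False
        if a & b and not (da == group or db == group):   # second assertion
            return False
    return True
```

A successful run for a few values of n is only a sanity check on the statement, not a substitute for the proof above.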


Also solved by S. Baskaran (India), Bennett College Team, D. M. Bloom, W. E. Bodden, Ramon 
Casanova, John Christopher, Alonzo Church, Jr., J. Coolidge, Patty DeLoach, Mary Ellien, Ali Fora 
(Jordan), Scott Forrest, C.S. Gardner, J. D. Gillam, M. G. Greening (Australia), J. W. Grossman, 
J. Haja, N. E. Harrison, C. V. Heuer, Erwin Just, Jonathan Kane, Charlotte Krauthamer, G. D. 
Kruse, S. E. Landsburg, O. P. Lossers (Netherlands), Luther College Problems Seminar, Milan Lustig 
(Czechoslovakia), D. E. Manes, L. E. Mattics & M. E. von Wolff, Gary McDonald & Merry Mc- 
Donald, D. L. Milgram & Azriel Rosenfeld, C. L. Morgan, C. M. Price, Kenneth Rosen, W. J. 
Sanchez, George Schillinger, J. R. Smith, Christopher Steel, Temple University Problem Solving 
Group, Guy Torchinelli, Mike Vitale, Albert Wilansky, Qazi Zameeruddin (India), and the proposer. 


Editor’s Comment: Gary and Merry McDonald showed that the restriction that G be abelian 
can be omitted only if G is finite. 


Prime Powers and the Sigma Function 


E 2376 [1972, 906]. Proposed by Arthur Marshall, Madison, Wisconsin 


Suppose that p and q are odd primes and that a and b are natural numbers such
that $p^a > q^b$. Show that if $p^a$ divides the product $\sigma(p^a)\sigma(q^b)$, then in fact $p^a = \sigma(q^b)$.


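Solution by Temple University Problem Solving Group. We have
$\sigma(p^a) = 1 + p + \cdots + p^a \equiv 1 \pmod p$.
Therefore $(p^a, \sigma(p^a)) = 1$. Thus, if $p^a$ divides $\sigma(p^a)\sigma(q^b)$, then $p^a \mid \sigma(q^b)$. Note that


$\sigma(q^b) = 1 + q + q^2 + \cdots + q^b = \dfrac{q^{b+1} - 1}{q - 1} < \dfrac{q}{q - 1}\, q^b \le 2q^b.$


But $q^b < p^a$. Therefore $\sigma(q^b) < 2p^a$. Also $1 < \sigma(q^b)$, so $p^a \mid \sigma(q^b)$ implies that $\sigma(q^b) = p^a$.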
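
A small search illustrates the result. The following Python sketch is not part of the solution; the function names and the search bounds are arbitrary choices for illustration. It collects every case in which $p^a > q^b$ and $p^a$ divides $\sigma(p^a)\sigma(q^b)$, so that one can confirm that $p^a = \sigma(q^b)$ in each of them.

```python
def sigma_prime_power(p, e):
    """Sum of the divisors of p**e for a prime p."""
    return sum(p**i for i in range(e + 1))

def divisibility_cases(primes, max_exp):
    """All (p, a, q, b) with p^a > q^b and p^a dividing sigma(p^a) * sigma(q^b)."""
    cases = []
    for p in primes:
        for a in range(1, max_exp + 1):
            for q in primes:
                for b in range(1, max_exp + 1):
                    pa, qb = p**a, q**b
                    product = sigma_prime_power(p, a) * sigma_prime_power(q, b)
                    if pa > qb and product % pa == 0:
                        cases.append((p, a, q, b))
    return cases

small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
cases = divisibility_cases(small_primes, 4)
```

One instance is p = 13, a = 1, q = 3, b = 2, where $\sigma(3^2) = 1 + 3 + 9 = 13$.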


Also solved by Anders Bager (Denmark), Merrill Barnebey, Jay Boekhoff, T. B. Carroll, John 
Christopher, Scott Forrest, Joe Illick, Eleanor G. Jones, J. W. King, E. M. Klein, N. J. Kuenzi & 
Bob Prielipp, L. Kuipers, S. E. Landsburg, P. A. Lindstrom, O. P. Lossers (Netherlands), Andrzej 
Makowski (Poland), J. B. Muskat (Israel), Kenneth Rosen, T. Salat (Czechoslovakia), Hwa Tang, 
Guy Torchinelli, Phil Tracy, W. R. Umbach (W. Germany), Mike Vitale, Roger Weitzenkamp, 
Charles Wexler, and the proposer. 


Editorial Note. Several solvers pointed out that the primes need not necessarily be odd. The 
solution above is valid also if either p or q is 2.


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers — The State University, 
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with 
double spacing) on separate, signed sheets and should be mailed before January 31, 1974. 
Contributors (in the United States) who desire acknowledgement of receipt of their solutions 
are asked to enclose self-addressed, stamped postcards. 


5928. Proposed by Carl Pomerance 


If n is an integer greater than 1, let t(n) equal the number of squares in the ring
$Z/(n)Z$. Find a formula for $s(n) = \lim_{k \to \infty} t(n^k)\, n^{-k}$.


5929. Proposed by Frederick Stern, California State University at San Jose 
Let 0 < x < 1. Show that


$$\prod_{n=1}^{\infty} (1 - x^n)^{-1} = 1 + \sum_{S} \frac{x^{i_1 + i_2 + \cdots + i_m}}{(1 - x^{i_1})(1 - x^{i_2}) \cdots (1 - x^{i_m})},$$


where $S = (i_1, i_2, \ldots, i_m)$ is any set of positive integers such that $i_1 < i_2 < \cdots < i_m$.


5930. Proposed by J. J. Buckley, University of South Carolina 


Let $\mathcal{L}$ be the Lebesgue subsets of R and let $\mathcal{B}$ be the Borel subsets of R. A problem
in Halmos, Measure Theory, p. 143 implies that if $f: R \to R$ is Lebesgue measurable,
then the graph of f belongs to the product σ-algebra $\mathcal{L} \times \mathcal{B}$. Is the converse true?


5931. Proposed by C. J. Smyth, University of Turku, Finland 


Let $\gamma$ be a PV-number (an algebraic integer with $|\gamma| > 1$, and conjugates $\gamma = \gamma_1,
\gamma_2, \ldots, \gamma_n$, with $|\gamma_i| < 1$ ($i = 2, \ldots, n$)). Show that for $i \ne j$, $|\gamma_i| = |\gamma_j|$ implies $\gamma_i = \bar{\gamma}_j$.


5932. Proposed by Richard Stanley, University of California at Berkeley 


Call two permutations $\sigma$ and $\rho$ in the symmetric group $S_n$ equivalent (denoted
$\sigma \sim \rho$) if every cycle c in the disjoint cycle decomposition of $\sigma$ is some power
(depending on c) of a cycle in the disjoint cycle decomposition of $\rho$. It is easily seen
that ~ is an equivalence relation. Let E(n) denote the number of equivalence classes.
Show that


$$\sum_{n \ge 0} E(n)\, x^n / n! = \exp\Bigl(\sum_{n \ge 1} \frac{x^n}{n\,\phi(n)}\Bigr),$$


where $\phi$ is the Euler totient function.
5933. Proposed by P. L. Renz, Wellesley College 


Let $\mathcal{G}$ be a family of random graphs constructed on a fixed countably infinite
vertex set V and having the property that the probability of {v, v'} being an edge of a
graph G of $\mathcal{G}$ is p, with 0 < p < 1, for each distinct pair of vertices v and v' in V.
What is the probability that a graph G in $\mathcal{G}$ contains an infinite complete subgraph?


SOLUTIONS OF ADVANCED PROBLEMS 


k-Chromatic Graphs 
5855 [1972, 523]. Proposed by Ioan Tomescu, Ploiesti, Rumania 


Show that any k-chromatic graph on n vertices none of which are isolated must
have at least $\tfrac12 k(k - 1) + \tfrac12 (n - k)$ edges.


Solution by A. M. Hobbs, Texas A and M University. Choose a k-coloring of a 
k-chromatic graph G. For each color c, if no vertex of color c is adjacent to vertices 
of all other colors, every vertex of color c can be recolored. Since this is impossible, 
for each of the k colors there is a vertex of degree k — 1 or more which is colored 
with that color. We have a lower bound for the degree of only one vertex of each 
color; for each of the other n — k vertices there is at least one edge incident to it (no 
isolated vertices) and so the sum of the degrees in G is $\ge k(k - 1) + (n - k)$. Thus
the number of edges in G equals half of the sum of the degrees of the vertices in G,
i.e., is $\ge \tfrac12 [k(k - 1) + (n - k)]$.


Also solved by B. A. Broemser, Jacques Chone (France), D. J. Kleitman, O. P. Lossers (Nether- 
lands), John Mitchen, B. R. Myers, Thomas Peterson, Simeon Reich (Israel), D. P. Sumner, Jeanne 
K. Tamaki, and the proposer. 


Editorial Note. Broemser with his solution observes that the minimum number of edges is
either attainable (n, k same parity) or is attained by adding 1/2 (n, k opposite parity).


Identities for Finite Subsets of a Set 
5856 [1972, 523]. Proposed by Jan Mycielski, University of Colorado


For any collection X of finite subsets of a set S we denote by X* the collection of
all finite subsets T of S such that the number of subsets of T which belong to X is odd.
Prove that X** = X and $(X \Delta Y)^* = X^* \Delta Y^*$, where $X \Delta Y = (X \cup Y) - (X \cap Y)$.


Solution, composite of contributions by Fred Galvin, University of California
at Los Angeles, and R. W. Quackenbush, University of Manitoba. Let F(S) denote
the collection of all finite subsets of S. Note that

$X^* = \{T \in F(S) : |2^T \cap X| \text{ is odd}\}$.


1. $|X \cap 2^T| + |Y \cap 2^T| = |(X \Delta Y) \cap 2^T| + 2\,|(X \cap Y) \cap 2^T|$,
whence
$|(X \Delta Y) \cap 2^T| \equiv |X \cap 2^T| + |Y \cap 2^T| \pmod 2$,
and thus $(X \Delta Y)^* = X^* \Delta Y^*$.

2. $\emptyset^* = \emptyset$, so $\emptyset^{**} = \emptyset$.

3. If $X = \{T\}$, then $X^* = \{R \in F(S) : T \subseteq R\}$, and $X^{**} = X$.

4. For finite X, $X^{**} = X$ is proved by induction using (1), (2) and (3).

5. Let $T \subseteq S$. For $X \subseteq F(T)$, define
$X' = \{U \in F(T) : |2^U \cap X| \text{ is odd}\}$.
Then for $X \subseteq F(S)$, $U \in F(T)$, we will have $U \in (X \cap 2^T)'$ if and only if
$|2^U \cap (X \cap 2^T)| = |X \cap 2^U|$
is odd, if and only if $U \in X^* \cap 2^T$. Thus $(X \cap 2^T)' = X^* \cap 2^T$ for all $T \subseteq S$ and
all $X \subseteq F(S)$.

Now let $X \subseteq F(S)$ be infinite and $T \in F(S)$. Then $T \in X^{**}$ if and only if
$T \in X^{**} \cap 2^T = (X^* \cap 2^T)' = (X \cap 2^T)'' = X \cap 2^T$
(since T is finite) if and only if $T \in X$. Thus $X^{**} = X$.
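
For a small finite set S both identities can be verified over every family X of subsets. The following Python sketch is an illustration only; the function names are hypothetical, and it takes S to be a three-element set, for which there are 256 families.

```python
from itertools import combinations

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def star(x, ground):
    """X*: the subsets T of `ground` with an odd number of subsets belonging to X."""
    return frozenset(t for t in powerset(ground)
                     if sum(1 for u in x if u <= t) % 2 == 1)

def check_identities(ground):
    subsets = powerset(ground)
    # Every family of subsets, encoded by a bit mask over `subsets`.
    families = [frozenset(s for i, s in enumerate(subsets) if mask >> i & 1)
                for mask in range(1 << len(subsets))]
    starred = {x: star(x, ground) for x in families}
    involution = all(starred[starred[x]] == x for x in families)
    additive = all(starred[x ^ y] == starred[x] ^ starred[y]   # ^ is symmetric difference
                   for x in families for y in families)
    return involution and additive
```

For instance, with X consisting of the single set {1}, X* is the collection of all subsets containing 1, in agreement with step 3.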


Also solved by A. Ehrenfeucht, Ellen Hertz, Eitan Lapidot (Israel), William Massey, L. E. 
Mattics, P. L. Renz, and the proposer. 


Note 1. Galvin suggests a generalization: Let P be a finite partially ordered set. If $A \subseteq P$,
let $A^* = \{x \in P : |\{y \in A,\ y \le x\}| \text{ is odd}\}$. Then


(i) $(A \Delta B)^* = A^* \Delta B^*$ for all $A, B \subseteq P$.
(ii) $A^{**} = A$ for all $A \subseteq P$, if and only if $|\{x \in P : a \le x \le b\}|$ is even for all $a < b$ in P.


Note 2. The proposer adds the following comment: Let $S = \{1, \ldots, n\}$. It is well known that
every Boolean function $f(x_1, \ldots, x_n)$ can be represented in two ways:

$f(x_1, \ldots, x_n) = \bigvee_{T \in X} \Bigl( \bigwedge_{i \in T} x_i \wedge \bigwedge_{i \in S - T} \neg x_i \Bigr),$

$f(x_1, \ldots, x_n) \equiv \sum_{T \in Y} \prod_{i \in T} x_i \pmod 2,$

where X and Y are suitable families of subsets of S. The relationship between X and Y is precisely
X* = Y.


A Sum of (0, 1) Random Variables 


5857 [1972, 523]. Proposed by Gérard Letac, Institut Universitaire de Tech- 
nologie, Aubiére, France 


$X_1, X_2, \ldots, X_n, \ldots$ being independent random variables such that $P(X_n = 0)
= P(X_n = 1) = \tfrac12$, define $S_t = \sum_{n=1}^{t} X_n / 2^n$. Take a set H of rational numbers of the
form $a/2^n$, such that H is dense in [0, 1]. Prove or disprove that $P(\exists t > 0 : S_t \in H) = 1$.


Solution by Ellen Hertz, Bronx, N.Y. Let $\varepsilon > 0$. Then there exists a set H of
rationals of the form $a/2^n$ such that H is dense in the unit interval but $P(\exists t : S_t \in H) < \varepsilon$.




Proof. $S_t$ is distributed uniformly on the $2^t$ points $0, 1/2^t, \ldots, (2^t - 1)/2^t$. Let D be
the set of all rationals in (0, 1) of the form $a/2^n$. If $p \in D$ define n(p) by setting
$p = a/2^{n(p)}$, a odd. For any given $p \in D$, $P(S_t = p) = 1/2^t$ if $t \ge n(p)$ and $P(S_t = p) = 0$
if $t < n(p)$. Hence


$P(\exists t : S_t = p) \le \sum_{t = n(p)}^{\infty} 1/2^t = 1/2^{n(p) - 1}.$

Since there are only finitely many values of p for each value of n(p), then
n(p) is unbounded in any subinterval of (0, 1).

Construct H as follows: Let $1/2^{n_0 - 2} < \varepsilon$.

Step 1. Select $0 < p_1 < \tfrac12$ such that $n(p_1) \ge n_0$. Select $\tfrac12 < p_2 < 1$ such that
$n(p_2) > n(p_1)$.

Step n. Select values of $p_{n(n+1)/2 + i}$, $i = 0, 1, \ldots, n$, such that $i/(n + 1) < p_{n(n+1)/2 + i}
< (i + 1)/(n + 1)$, always requiring that $n(p_{k+1}) > n(p_k)$, $k = 1, 2, \ldots$.

Then

$P(\exists t : S_t \in H) \le \sum_{k=1}^{\infty} P(\exists t : S_t = p_k) \le \sum_{k=1}^{\infty} 1/2^{n(p_k) - 1} \le \sum_{k = n_0 - 1}^{\infty} 1/2^k = 1/2^{n_0 - 2} < \varepsilon.$


Also solved by J. C. Kieffer, S. P. Lloyd, L. E. Mattics, Daniel Mosenkis, P. van der Steen 
(Netherlands), and the proposer. 


An Irrational in a Covering of the Rationals 


5858 [1972, 523]. Proposed by Leonard Gallagher, University of Colorado 
Let $Q = \{r_i\}_{i=1}^{\infty}$ be any enumeration of the rationals and consider open intervals
I; = N,,(r;) about r;. Since 


G= 0 U J" 


n=1 i=1 
is a $G_\delta$ set, $Q \ne G$. Demonstrate an irrational element of the $G_\delta$ set.


Solution by G. A. Heuer, Concordia College. For each i, let $r_i = m_i/n_i$, where
$m_i$ and $n_i$ are relatively prime integers and $n_i > 0$. Let $q_1 = r_1$ and $\phi(1) = 1$. When
Ix = Vocuy IS given, let qy41=Tgautiy, Where O(k + 1) is the first integer greater than 
o(k) such that n>, and | qu+1 — | < min {1/2nf*, 1/2%***"}. Since n, = k, 
the sequence {q,} converges to some number €. 

For each h, 


Ie — Tocny| =|¢—-4,| Ss 2 [deem a | 


<min | 1/2ni*, d yaemene st 
k=h 


k=h 




<min{d 1/(Qn™-2*-"), DY 1/20 teko ket 
k=h k=h 


< min {1 /njpr, 1/297", 


where in the next to the last step we use (n,.,)"**' > (n,)"*** 2 2n,* and (k + 1) 
> $(k) + 1. Thus € is a Liouville number (hence transcendental) and €€ 13) *" for 


each h. Therefore €€G. 


Also solved by R. J. Evans, O. P. Lossers (Netherlands), R. H. Marty, Lieselotte Miller, and the 
proposer. 


REVIEWS 


EDITED BY J. ARTHUR SEEBACH, JR. AND LYNN A. STEEN


with the assistance of the mathematics departments of St. Olaf and Carleton Colleges 
COLLABORATING EDITOR FOR FILMS: SEYMOUR SCHUSTER, CARLETON COLLEGE 


We invite readers to submit reviews of significant recent college-level mathematics books. 
We especially encourage reviews based on classroom use, or comparative reviews of several 
related books. Reviews should ordinarily not exceed two pages (per book) typed double spaced. 
Manuscripts of reviews as well as books submitted for review should be sent to: Book Review
Editor, American Mathematical Monthly, St. Olaf College, Northfield, MN 55057. 


Set Theory and Metric Spaces. By Irving Kaplansky. Allyn and Bacon, Boston, 
Mass. 1972. xii + 140 pp. $9.95. (Telegraphic Review, June/July 1972.) 


How enjoyable it is to find a book in mathematics that is just plain fun to read. 
Unfortunately, this doesn’t happen very often. It’s a pleasure to report that Set 
Theory and Metric Spaces is one of these enjoyable additions to the field. 

Irving Kaplansky returns in this case to speak to the undergraduate on what 
sounds to be a very ordinary topic. But to say “‘speak”’ is the main reason the book 
is not ordinary, for the book has the ring of the small classroom lecture with all of 
its freedom, humor, and patience. The author does this through an amazingly fresh 
and descriptive use of language — a conversation with the reader. 

The book material itself has grown from a set of notes initially prepared by Spanier 
and used as a text at the University of Chicago. Kaplansky’s goal is to present the 
working essentials of set theory that a mathematician needs to know along with 
some basic topics in the study of metric spaces. 

In this philosophy, the author’s introductory set theory is initially terse and somewhat
intuitive. As he states himself, his set theory is “supernaive”. He presents just


THE AMERICAN 
MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 By BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


NUMBER 9 


VOLUME 80 
CODEN: AMMYAE 
CONTENTS 
The Spinor Spanner . . . E. D. BOLKER 977
Linear Combinations of Sets of Consecutive Integers . . . D. A. KLARNER AND R. RADO 985
The Equation x′(t) = ax(t) + bx(t − τ) with “Small” Delay . . . R. D. DRIVER, D. W. SASSER AND M. L. SLATER 990
The Quaternion Calculus . . . C. A. DEAVOURS 995
Inequalities for Sums of Distances . . . G. D. CHAKERIAN AND M. S. KLAMKIN 1009
The William Lowell Putnam Mathematical Competition . . . J. H. MCKAY 1017
Simple Groups . . . 1028
MATHEMATICAL NOTES
On a Problem Concerning Euler’s Phi-function . . . HAROLD DONNELLY 1029
What is the Probability that Two Group Elements Commute? . . . W. H. GUSTAFSON 1031
Remarks on the Bessel Polynomials . . . C. W. BARNES 1034
A Micronote on a Functional Equation . . . H. N. SHAPIRO 1041
An Addendum to the Paper “A Characterization of the n x n Matrices over a Finite Field” . . . J. V. BRAWLEY AND L. CARLITZ 1041


RESEARCH PROBLEMS
Exploring a Planet . . . L. FEJES TÓTH 1043
What are the Latin Square Groups? . . . J. J. CARROLL, G. A. FISHER, A. M. ODLYZKO, AND N. J. A. SLOANE 1045


CLASSROOM NOTES
Geometric Fit of a Monotonic Cubic . . . W. P. COOKE 1047
A Familiar Combinatorial Identity Proved by Complex Analysis . . . STEVEN MINSKER 1051


(Continued on inside cover) 


NOVEMBER 1973 


MATHEMATICAL EDUCATION
A Discovery Course in Graph Theory . . . J. L. LEONARD 1052
A Bow to Relevancy . . . R. L. WILSON 1053
Concerns of Two-year Colleges . . . G. F. GILMER, H. B. SINER, R. MANSFIELD AND W. G. CHINN 1055
ELEMENTARY PROBLEMS AND SOLUTIONS . . . 1057
ADVANCED PROBLEMS AND SOLUTIONS . . . 1067
REVIEWS . . . 1072
NEWS AND NOTICES . . . 1090
MATHEMATICAL ASSOCIATION OF AMERICA . . . 1090
Employment Information for Mathematicians . . . 1090
April Meeting of the Indiana Section . . . 1091
April Meeting of the Metropolitan New York Section . . . 1092
April Meeting of the Southwestern Section . . . 1093
April Meeting of the Texas Section . . . 1093
May Meeting of the Allegheny Mountain Section . . . 1094
May Meeting of the Rocky Mountain Section . . . 1096
May Meeting of the Seaway Section . . . 1097
Calendars of Future Meetings . . . 1098


NOTICE TO AUTHORS


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p. 2). Manuscript preparation: Please
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY.
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for
protection against loss.


Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes
11 months, Math. Education 10 months.


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of Mathe- 
matics, Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor; 
ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of America, 
SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B. 
WILLcox, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036. 


HARLEY FLANDERS, Editor 
ALEX ROSENBERG, Editor-Elect 
ASSOCIATE EDITORS 


JOSHUA BARLAZ J. G. HARVEY SEYMOUR SCHUSTER 

E. R. BERLEKAMP ERIC S. LANGFORD J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA P. D. LAX E. P. STARKE

ROBERT GILMER ARTHUR MATTUCK LYNN A. STEEN 
RICHARD GUY M. W. POWNALL JAMES WENDEL 

RAOUL HAILPERN GIAN-CARLO ROTA 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January,
February, March, April, May, June-July, August-September, October, November, December.
Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


THE SPINOR SPANNER 


ETHAN D. BOLKER, University of Massachusetts, Boston 


1. Introduction. Consider a wrench, which is an object asymmetrical enough
so that the result of any proper rotation performed on it is easily recognized. Rotate
the wrench through a full 360° turn about an axis. Has it returned to its original
state? Physical and geometric intuition both say “yes”, yet the calculus of spinors,
which models the quantum mechanical behavior of neutrons, predicts that the answer
would be “no” if the wrench were a neutron, or any other Fermion, a particle with
half integral spin. More striking still, the predicted answer is “yes” for two full
turns (720°) about the same axis. No experiment has yet been performed to verify
these predictions, because beam splitters and interferometers for beams of polarized
neutrons do not yet exist, but several such experiments have been imagined [1], [2].
There is, however, an easy experiment with an analogous outcome. P. A. M. Dirac
invented it to lessen, in lectures, the implausibility of the neutron’s predicted
behavior [3]. Consider the wrench again, which Dirac would have called by its English
name, a spanner, hence a spinor spanner because of the use to which he put it.
Attach it by three cords to the walls of the room. (See the solid lines in Fig. 1.)




When we turn the wrench through 360° the cords become tangled (the dashed lines 
in Fig. 1); no tampering can undo that tangle as long as the wrench is fixed. After 
two full turns (the dotted line in Fig. 1) the snarl seems worse but is not. Before 
reading further, find a wrench, perform the experiment, and convince yourself of 
the striking fact that after two full turns the cords are essentially untangled. The 
geometry of the spinor spanner is the key to Piet Hein’s topological game Tangloids, 
described by Martin Gardner in the Scientific American [9], and to an ingenious 
device invented and patented by D. A. Adams which allows a rotating platform 
to be connected to a stationary base with a flexible cable without using slip rings 
or rotary joints [8]. 

I first saw the spinor spanner demonstrated by Norman Ramsey, a physicist,
while I was a graduate student. In this paper I shall explain in mathematical terms
why the spinor spanner works, and indicate how that explanation can be couched
in language suitable for mathematics clubs and more general mathematically naive
audiences. Someday, I should like to make a movie of the spinor spanner.


FIG. 1


Ethan Bolker received his Harvard Ph.D. under Andrew Gleason. He has held positions at
Princeton, Bryn Mawr College and the University of Massachusetts, Boston. He spent a year’s leave
at Berkeley, and a second at Harvard. His main research interest is in combinatorics. He wrote up
Professor Lynn Loomis’s lecture notes on Harmonic Analysis put out by the M.A.A. in 1965, and he
is the author of Elementary Number Theory, An Algebraic Approach (W. A. Benjamin, 1970). Editor.

We are about to show that the fundamental group G of SO(3), the group of proper 
rotations of Euclidean 3-space, is of order 2, and to exploit the proof to find a method 
for untangling the cords. Since Fermions correspond to representations of the double 
covering group of SO(3) which do not factor through SO(3) itself, the fact that the 
order of G is 2 really accounts both for the spinor spanner and for the neutron’s 
behavior. 


2. Homotopy. Let $X$ be a topological space and $x_0$ a fixed point in $X$. A naive
audience could think of $X$ as a smooth part of some Euclidean space, say the surface
of a sphere, or a solid torus, or an annulus. A loop in $X$ is a continuous function
$P\colon [0,1] \to X$ for which $P(0) = P(1) = x_0$. If you think of $X$ as a park then a loop
may be thought of as the record of an hour’s walk in $X$, starting and ending at $x_0$.
Be sure to distinguish this precise usage from the more customary meaning of “closed
path in a park.” The latter is the image of the function $P$. Two loops $P$ and $Q$ are
homotopic, written $P \sim Q$, when one can be continuously deformed into the other.
Formally, $P \sim Q$ when there is a continuous $f\colon [0,1] \times [0,1] \to X$ for which
$f(0,s) = f(1,s) = x_0$, $f(t,0) = P(t)$ and $f(t,1) = Q(t)$. Informally, suppose that you walk
your dog in $X$: you follow $P$ while he follows $Q$. Then $P \sim Q$ means that when the
walk is over the leash joining the two of you can be pulled in without encountering
any parklike obstacles, trees or lakes, which you and your dog passed on opposite
sides of. This interpretation makes clear the importance of the direction in which
you traverse the curve which is the image of $P$. If $P$ is a sense preserving
reparametrization of $Q$ then $P \sim Q$. The loop corresponding to the lazy man’s walk
is the constant loop $0$ defined by $0(t) = x_0$ for all $t$.

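Now let us consider taking two walks in succession. We shall denote “$P$ followed
by $Q$” by “$P \oplus Q$”. As a function, $P \oplus Q$ is defined by

\[
(P \oplus Q)(t) =
\begin{cases}
P(2t) & \text{if } 0 \le t \le 1/2, \\
Q(2t - 1) & \text{if } 1/2 \le t \le 1.
\end{cases}
\]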


It is intuitively clear and not hard to prove that homotopy is an equivalence relation,
that the homotopy class of $P \oplus Q$ depends only on the classes of $P$ and of $Q$, and
that the set of homotopy classes is a group under $\oplus$. Details can be found in many
topology texts (for example, [5] and [7]). The group is not usually abelian, but I
have found additive notation less confusing than multiplicative for naive audiences.
Observe that $0$ is the group identity: $P \oplus 0 \sim P$. Our job now is to find the inverse
of $P$, the solution to $P \oplus {?} \sim 0$. The dog walking analogy can lead us to a good
guess. If you are lazy while your dog follows $P$ then his leash will be tangled when
he returns, unless, by chance, $P \sim 0$. How could you untangle the leash? If the dog
is intelligent the answer is easy: ask him to retrace his steps. That is, if we define
the loop $-P$ by $(-P)(t) = P(1 - t)$ then $P \oplus (-P) \sim 0$.
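
The two operations on loops are easy to experiment with. The following Python sketch is an illustration only and not part of the argument: it represents a loop as a function on the interval from 0 to 1, with the unit circle in the plane standing in for the park. The names `concat`, `reverse`, `circle` and `constant` are hypothetical.

```python
import math

def concat(p, q):
    """Return the loop 'p followed by q': p at double speed, then q at double speed."""
    def loop(t):
        return p(2 * t) if t <= 0.5 else q(2 * t - 1)
    return loop

def reverse(p):
    """Return the loop that retraces p: reverse(p)(t) = p(1 - t)."""
    return lambda t: p(1 - t)

def circle(t):
    """Sample loop (an assumption for the demo): once around the unit circle from (1, 0)."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def constant(t):
    """The lazy man's walk: the constant loop at the base point (1, 0)."""
    return (1.0, 0.0)

# The dog walks once around and then retraces his steps.
round_trip = concat(circle, reverse(circle))

# The second half mirrors the first: the positions at times t and 1 - t agree.
for t in (0.1, 0.25, 0.4):
    out, back = round_trip(t), round_trip(1 - t)
    print(t, math.isclose(out[0], back[0], abs_tol=1e-12),
          math.isclose(out[1], back[1], abs_tol=1e-12))
```

The mirror symmetry printed by the loop is what makes the retraced walk shrinkable: the leash can be pulled in along the way the dog came.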




3. Pasting. The topological spaces we can visualize as smooth parts of 2- or 
3-space are too simple to help us analyze the spinor spanner. We need a method for 
studying homotopy in more complicated ones. 

If we take a square piece of paper and paste together a pair of parallel sides, 
top to top and bottom to bottom, we have made a cylinder. We can study homotopy 
on the cylinder without actually pasting the square, as long as we remember that 
points along one edge are identified with corresponding points on the other. The 
idea of “‘pasting’’ can be made precise using quotient topologies, but we have no 
need for that much sophistication. For naive audiences it is instructive to mention 
the various spaces which can be obtained by pasting edges of a rectangle. They are 
the cylinder, the Möbius strip, the torus, the Klein Bottle, and, finally, the projective
plane. The identifications which lead to these are symbolically indicated in Figs.
2.1-2.5 respectively, in which some loops are sketched as well. The Klein Bottle
and the projective plane cannot actually be constructed in 3-space but we can
study them nevertheless. Fig. 2.5 suggests a more symmetrical view of the projective
plane $\Pi$. Since each pair of opposite points of the square is pasted, the corners
assume no special role. We can build $\Pi$ from a disk $\Delta$ by pasting together each pair
of antipodal points on the rim: in Fig. 3, these are pairs $(A, A')$, $(B, B')$, $(C, C')$,
etc.


FIG. 2 (2.1-2.5)


Let $x_0$ be the center and $L$ a directed diameter of $\Delta$. Since the ends of $L$ are
identified when we build $\Pi$ we can consider the loop $P$ in $\Pi$ which begins at $x_0$, follows
$L$ to the rim of $\Delta$ and then returns to $x_0$ along the other half of $L$. We show next
that $P \not\sim 0$; to do so we use a homeomorphic copy or model of $\Pi$. Start with the
disk $\Delta$ and stretch it to form a closed hemisphere. Now consider the spherical
surface $\Sigma$ of which $\Delta$ is a part. If we paste together each pair of antipodal points of
$\Sigma$, then $\Pi$ will result. To see this, paste first all the antipodal pairs one member of
which lies in the interior of $\Delta$. That yields the hemisphere into which $\Delta$ was stretched.
The rest of the pasting, of the pairs on the equator, is just what to do to $\Delta$ to build $\Pi$.
In this model for $\Pi$ the north and south poles $n$ and $s$ of $\Sigma$ paste together to make $x_0$.
In $\Sigma$ there is a unique continuous curve $S$ which starts at $n$ and which becomes $P$
when $\Sigma$ is pasted to form $\Pi$, namely, the appropriate meridian joining $n$ to $s$. That
curve is not a loop in $\Sigma$. If $P$ were homotopic to $0$ in $\Pi$ we could lift that homotopy
to $\Sigma$ and so construct a continuous deformation in $\Sigma$ of $S$ to the constant loop
at $n$ during which the endpoints $n$ and $s$ of $S$ remained fixed. Since such a deformation
is clearly impossible, $P \not\sim 0$ in $\Pi$. We can see too that $P \oplus P \sim 0$, because $P \oplus P$
is the result in $\Pi$ of pasting a great circle through $n$ and $s$ in $\Sigma$. That great circle
easily shrinks to the constant loop at $n$ in $\Sigma$. But to untangle cords later, we must
now show in another way that $P \oplus P \sim 0$. Consider again our first model for $\Pi$,
obtained by pasting pairs of opposite points on the rim of $\Delta$.


FIG. 3


Let $M$ be another directed diameter of $\Delta$, and let $Q$ follow $M$ in $\Pi$ as $P$ follows
$L$ (see Fig. 3). We can rotate $L$ in $\Delta$ until it coincides with $M$; this rotation is a
continuous deformation in $\Pi$ of $P$ to $Q$. If we take for $M$ the diameter $L$ with its
direction reversed then $Q$ is $-P$, so $P \oplus P \sim P \oplus (-P) \sim 0$. The projective plane
thus surrounds a peculiar kind of hole. If you travel around it twice in the same
direction you’ve not gone around it at all. That is analogous to what happens to
the spinor spanner. In each case doubling something makes it vanish. But with the
techniques of homotopy and pasting, we can do better than produce an analogy
for the spinor spanner. We can predict and explain its behavior.


4. The topology of SO(3). Let $\Omega$ be the space of all possible configurations of
the wrench. A point $\omega \in \Omega$ is thus the result of a particular proper rotation. Remember,
it is the configuration we are talking about, not the means by which the wrench
came to that configuration. It is intuitively clear that $\Omega$ is a nice topological space.
Our complete turn of the wrench about an axis corresponds to a loop $P$ in $\Omega$ which
begins and ends at the initial configuration $\omega_0$. We shall show $P \not\sim 0$ but $P \oplus P \sim 0$
in $\Omega$ and then show how the homotopy which shrinks $P \oplus P$ to $0$ tells us how to
untangle cords.

We begin by building a model of $\Omega$. Replace the wrench by the surface of a
sphere $\Sigma$ centered at the origin. Then each $\omega \in \Omega$ can be identified with a map from
$\Sigma$ to itself defined by letting $\omega(\sigma)$ = the position of $\sigma$ when $\Sigma$ is moved to
configuration $\omega$. As a map, $\omega$ preserves distances and the orientation of spherical
triangles. We next show, in two ways, that every such map has a fixed point. Since
$\omega$ extends to a proper linear isometry of $\mathbf{R}^3$ the roots of its cubic characteristic
polynomial have product 1 and each is of absolute value 1. Thus those roots are $1$,
$e^{i\theta}$, $e^{-i\theta}$ for some $\theta$. Since 1 is a root, 1 is an eigenvalue and $\omega$ has a fixed point. This
argument clearly works in $\mathbf{R}^n$ if and only if $n$ is odd.
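
The eigenvalue argument can be checked numerically. The sketch below is an illustration under stated assumptions: the rotation `omega` is an arbitrary sample built from two coordinate rotations, the helper names are hypothetical, and the fixed axis is read off from the skew part of the matrix, a shortcut that works only when the angle of rotation is neither 0 nor π.

```python
import math

def rot_x(r):
    """Rotation through r radians about the x axis."""
    c, s = math.cos(r), math.sin(r)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(r):
    """Rotation through r radians about the z axis."""
    c, s = math.cos(r), math.sin(r)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """Product of two 3 x 3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Image of the vector v under the matrix m."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def fixed_axis(m):
    """Unit eigenvector for the eigenvalue 1, taken from the skew part m - m^T."""
    v = [m[2][1] - m[1][2], m[0][2] - m[2][0], m[1][0] - m[0][1]]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

omega = matmul(rot_z(0.7), rot_x(1.9))   # sample proper rotation (assumed values)
axis = fixed_axis(omega)
image = apply(omega, axis)               # should coincide with axis

# The roots 1, e^{i theta}, e^{-i theta} sum to the trace, so trace = 1 + 2 cos(theta).
trace = omega[0][0] + omega[1][1] + omega[2][2]
theta = math.acos((trace - 1) / 2)
print(axis, image, theta)
```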


Here is a second proof in $\mathbf{R}^3$, suitable for audiences who know no linear algebra.
Let $\Sigma$ have circumference 2. For $x, y \in \Sigma$ let $\mu(x, y)$ be the least great circle distance
between $x$ and $y$. The function $f\colon \Sigma \to \mathbf{R}$ defined by $f(x) = \mu(x, \omega x)$ is continuous
and so assumes its minimum value $\delta \ge 0$ at some $a \in \Sigma$. If $\delta = 0$ then $\omega a = a$
and $\omega$ has the fixed point we desire. We shall show next that $\delta > 0$ implies $\delta = 0$.
Suppose $\delta > 0$. Since $\omega$ is proper it cannot map every point to its antipode. Thus
$\delta < 1$, so we can find a hemisphere $H$ containing both $a$ and $\omega a$. In $H$ draw the
great circle $\Gamma$ joining $a$ to $\omega a$; it has length $\delta$. Now draw two circles $C$ and $D$ centered
at $a$ and $\omega a$ respectively; make them so small that they lie in $H$ and do not overlap.
Let $c$ be the intersection of $C$ and $\Gamma$ and $d$ the intersection of $D$ and the continuation
of $\Gamma$. Since $\mu(a, c) = \mu(\omega a, \omega c)$, $\omega c \in D$. But every point on $D$ except $d$ is less than
$\delta$ units from $c$, so $\omega c = d$. Now let $n$ and $s$ be the poles for which $\Gamma$ lies on the
equator. Then $\mu(a, n) = \mu(c, n) = 1/2$ so $\mu(\omega a, \omega n) = \mu(\omega c, \omega n) = 1/2$. Therefore
$\omega n = n$ or $s$. But $\omega n = s$ is impossible because $\omega$ preserves the orientation of the
spherical triangle $acn$. Thus $\omega n = n$, $n$ is a fixed point, and $\delta = 0$. That is, $a$ must
have been fixed to begin with.


Suppose $\omega \ne \omega_0$. Then $\omega$ has exactly two fixed points $n_\omega$, $s_\omega$, which lie at
opposite ends of a diameter of $\Sigma$, and $\Sigma$ can be brought to configuration $\omega$ by a rotation
of $r$ radians about the axis $n_\omega s_\omega$. We wish to consider rotations which are
counterclockwise when we look down on $n_\omega$ from outer space; this is the familiar right hand
rule. In lectures I use an inflatable globe to show that a counterclockwise rotation of
102° about the axis joining Bermuda to Perth, Australia, moves Duluth to the
Panama Canal. Since a clockwise rotation about an axis is a counterclockwise
rotation about the same axis with its north and south poles interchanged, and since
rotations through $r$ and $r - 2\pi$ radians about an axis lead to the same $\omega$, we can
describe an $\omega \ne \omega_0$ by giving a vector $m(\omega) \ne 0$ with length $\| m(\omega) \| \le \pi$: $m(\omega)$
points toward $n_\omega$ from the origin, and $\| m(\omega) \| = r$. If we set $m(\omega_0) = 0$, the range of
$m$ is the solid ball $B$ of radius $\pi$ centered at $0$. The function which inverts $m$ is one to
one except when $\| V \| = \pi$, for rotations through $\pi$ radians about $V$ and $-V$ lead
to the same $\omega$. Thus $\Omega$ is modeled by the space $X$ which results when we paste
together each pair of antipodal points on the surface of the solid ball $B$, because
$m\colon \Omega \to X$ is a homeomorphism; one to one, onto, continuous, and with a
continuous inverse. The loop $P$ in $X$ which corresponds to a full turn of the wrench about
an axis $L$ starts at the center of $B$, moves out along $L$ to the surface and returns
to the center along the other half of $L$. It is analogous to the loop with the same
name we have just studied in $\Pi$. In fact, $\Pi$ is a subspace of $X$ in a natural way,
so that the two loops we have named “$P$” coincide. Since $P \sim -P$ in $\Pi$, $P \sim -P$
in $X$. For those who like formulas, we give one for that homotopy. Let $\Delta$ be the
intersection of $B$ with the $x,z$ plane and $L$ the directed diameter which extends to
the directed $x$ axis. The homotopy which interests us rotates $L$ in $\Delta$ to change $P$ to
$-P$. The matrix for a right handed rotation through $r$ radians about the axis in
the $x,z$ plane which makes an angle of $\theta$ radians with $L$ is


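\[
f(r, \theta) =
\begin{pmatrix}
\cos^2\theta + (\cos r)\sin^2\theta & -(\sin r)\sin\theta & (1 - \cos r)\sin\theta\cos\theta \\
(\sin r)\sin\theta & \cos r & -(\sin r)\cos\theta \\
(1 - \cos r)\sin\theta\cos\theta & (\sin r)\cos\theta & \sin^2\theta + (\cos r)\cos^2\theta
\end{pmatrix}
\]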


The function $f$ is continuous on $[0, 2\pi] \times [0, \pi]$, $f(\cdot\,, 0)$ is the loop $P$, and $f(\cdot\,, \pi)$
is the loop $-P$, so $f$ is our homotopy in $\Omega$.
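
For readers who prefer to test a formula before trusting it, here is a small Python sketch that evaluates the matrix above on a grid. It is only a numerical spot check, not a proof of continuity, and the helper names and the grid size are assumptions. It confirms that every stage of the deformation is a loop based at the identity configuration, that every value is an orthogonal matrix, and that the final stage retraces the initial one.

```python
import math

def f(r, theta):
    """Right handed rotation through r about the axis (cos theta, 0, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    cr, sr = math.cos(r), math.sin(r)
    return [
        [c * c + cr * s * s, -sr * s, (1 - cr) * s * c],
        [sr * s, cr, -sr * c],
        [(1 - cr) * s * c, sr * c, s * s + cr * c * c],
    ]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 3 x 3 matrices."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(3) for j in range(3))

def times_transpose(m):
    """Return m * m^T, which is the identity exactly when m is orthogonal."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
TWO_PI = 2 * math.pi

def deforms_P_to_minus_P(samples=24):
    """Spot check the homotopy on a (samples + 1) x (samples + 1) grid."""
    for a in range(samples + 1):
        theta = math.pi * a / samples
        # Each stage f(., theta) begins and ends at the identity configuration.
        if not (close(f(0, theta), IDENTITY) and close(f(TWO_PI, theta), IDENTITY)):
            return False
        for b in range(samples + 1):
            r = TWO_PI * b / samples
            # Every value is an orthogonal matrix.
            if not close(times_transpose(f(r, theta)), IDENTITY):
                return False
            # The last stage retraces the first: f(r, pi) = f(2 pi - r, 0).
            if not close(f(r, math.pi), f(TWO_PI - r, 0)):
                return False
    return True

print(deforms_P_to_minus_P())
```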

To prove $P \not\sim 0$ in $X$ we cannot merely use the fact that $P$ lives in the subspace
$\Pi$ of $X$, for although no deformation of $P$ to $0$ is possible inside that subspace
one might be possible in $X$. To rule that out we need a new model for $\Omega$ analogous
to our second model for $\Pi$, the one we built by pasting antipodal points on the
2-sphere $\Sigma$. Let $\Phi$ be the 3-sphere in real 4-space. We can stretch $B$ so that it covers
a hemisphere of $\Phi$. Then $\Omega$ results when we paste antipodal pairs in $\Phi$, since $B$ results
when we paste first those pairs one member of which is interior to $B$. In this model
the north and south poles of $\Phi$ paste together to make $\omega_0$. Now the proof that
$P \not\sim 0$ in $\Omega$ proceeds as it did for $\Pi$. In technical terms, we have just constructed
and then used a simply connected covering space $\Phi$ for $\Omega$.
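
One standard way to put coordinates on the 3-sphere is by unit quaternions. That identification is not used in the text and is an assumption of the following Python sketch, as are the function names. The sketch shows the two facts the covering argument needs: antipodal points of the 3-sphere give the same rotation, and a full turn lifts to a path that ends at the antipode of its starting point, while two full turns lift to a closed path.

```python
import math

def quat_about_x(r):
    """Unit quaternion (w, x, y, z) for a rotation through r radians about the x axis."""
    return (math.cos(r / 2), math.sin(r / 2), 0.0, 0.0)

def rotation_matrix(q):
    """Rotation of 3-space determined by the unit quaternion q."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two equally long sequences of numbers."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))

def flatten(m):
    """List the entries of a matrix row by row."""
    return [entry for row in m for entry in row]

q = quat_about_x(1.1)                 # sample point of the 3-sphere
antipode = tuple(-c for c in q)

# Pasting antipodal pairs: q and -q describe one and the same configuration.
same_rotation = close(flatten(rotation_matrix(q)), flatten(rotation_matrix(antipode)))

# The lift of one full turn runs from (1, 0, 0, 0) to its antipode (-1, 0, 0, 0),
# while the lift of two full turns returns to (1, 0, 0, 0).
after_one_turn = quat_about_x(2 * math.pi)
after_two_turns = quat_about_x(4 * math.pi)
print(same_rotation, after_one_turn, after_two_turns)
```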


5. Untangling cords. To exploit the fact that $P \sim -P$, and hence that
$P \oplus P \sim 0$ in $\Omega$, we must model $\Omega$ and loops in it one more way. Consider two
concentric spheres; call the inner one the globe (or the wrench, or the neutron)
and the outer one the edge of the universe. Suppose the distance between the spheres
is 1. Cords, as many of them as we wish to attach, lie initially along radii joining the
globe to the edge of the universe. Pack the space between the globe and the
edge of the universe with concentric spherical shells $\Sigma_t$, where $t \in [0, 1]$ measures
the distance of $\Sigma_t$ from $\Sigma_0$. Each cord is attached to $\Sigma_t$ where they meet.
Imagine that the shells can slide relative to each other. Let $R$ be any loop in $\Omega$
starting and ending at $\omega_0$; suppose we manipulate the globe $\Sigma_0$ so that at time $t$ it is at
$R(t)$. Then the cords cause the intermediate shells to record $R$: at time $t$, $\Sigma_t$ is in
configuration $R(t)(\Sigma_t)$. A homotopy $R \sim Q$ of paths in $\Omega$ is a function
$f\colon [0,1] \times [0,1] \to \Omega$ satisfying the conditions listed earlier. If we now manipulate
the shells $\Sigma_t$ so that at time $s$, shell $\Sigma_t$ is at position $f(t, s)(\Sigma_t)$ we shall have deformed
the cords, which initially recorded $R$, to a record of $Q$. Thus when $P$ is the loop
corresponding to a full turn about an axis the homotopy $P \oplus P \sim 0$ really tells us
that our cords can be untangled, even if we started with many more than three.

Because $P \not\sim 0$, no manipulation of the intermediate shells can untangle the
cords after one full turn. It is true and slightly subtler that they cannot be untangled
at all [6].


FIG. 4


Let us close by seeing just how the particular homotopy we have studied untangles
the cords after two full turns. To convert $P \oplus P$ to $0$ we first deform the second
summand $P$ to $-P$, or, in other words, deform the result of a full right handed turn
about $L$ to the result of a full left handed turn. We do that by rotating $L$, the axis
of the turn, in the subspace $\Delta$ of $B$, so that it reverses its direction. In Fig. 4 we sketch
what happens to one of the cords between 2, and Z,,., the one which lies initially 




along axis $AA'$. When the globe executes a full turn about $L_i$ the cord assumes
position $i$. As $i$ varies from 1 to 4, $L_i$ rotates counterclockwise through $\pi$ radians in
the plane of the paper. That operation simultaneously loops the pictured cord
and all others on the right over and behind the wrench and those on the left under
and in front. That is easier to do than to describe: try it. It really untangles cords.
With a little practice it makes a good lecture demonstration or conversation piece,
a magic trick which is not magic, but which reflects a fundamental yet little known
property of the space in which we live. The analogy between the spinor spanner
and the neutron suggests that the state of the latter depends not only on its position
and momentum but on which of two topologically distinct ways it is tied to its
surroundings. A full turn about an axis leaves its position and momentum unchanged
but reverses its topological relation to the rest of the universe.


Acknowledgments. Some of the ideas in this paper I explored in conversation with M. Artin, 
F. C. Cunningham, Jr., M. Gaffney, A. M. Gleason, N. Stein, and N. Ramsey. I am indebted to the
Bryn Mawr College Chapter of Sigma Xi and to West Chester State College, West Chester, Pa., 
where I spoke on the spinor spanner. Final thanks go to Ms. Jessica Bolker, who built the large model 
wrench I turn while I talk. 


References 


1. Y. Aharonov and L. Susskind, Observability of the sign change of spinors under 2π rotations,
Phys. Rev., 158 (1967) 1237-8.

2. H. J. Bernstein, Can 360° rotations be detected? Scientific Research (McGraw-Hill’s News 
Magazine of Science), Vol 4, No. 17 (August 18, 1969) 32-33. 

3. P. A. M. Dirac, in conversation on October 30, 1972, remarked that his initial demonstration 
model was a pair of scissors, to which it is easy to attach the cords. 

4. E. Fadell, Homotopy groups of configuration spaces and the string problem of Dirac, Duke 
Math. J., 29 (1962) 231-242. 

5. W.S. Massey, Algebraic Topology: An Introduction, Harcourt Brace and World, New York, 
1967, chapters 2 and 5. 

6. M. H. A. Newman, On the string problem of Dirac, J. London Math. Soc., 17 (1942) 173-177. 

7. A. H. Wallace, An Introduction to Algebraic Topology, International Series in Pure and 
Applied Mathematics, Pergamon Press, New York, 1957, Chapter iv. 

8. D. A. Adams, Apparatus for Providing Energy Communication Between a Moving and a 
Stationary Terminal, U.S. Patent 3,586,413, June 22, 1971. Copies available from the U.S. Patent 
Office or from Mr. Adams at 7434 E. Montecito Drive, Tucson, Ariz. 85710. 

9. M. Gardner, New Mathematical Diversions from Scientific American, Simon and Schuster, 
New York, 1966, Chapter 2. 


LINEAR COMBINATIONS OF SETS OF CONSECUTIVE INTEGERS 
D. A. KLARNER, Stanford University, and R. RADO, University of Reading, England 
Dedicated to Paul Erdős on his sixtieth birthday.


Let $k - 1, m_1, \dots, m_k$ denote positive integers such that $m_1, \dots, m_k$ have greatest
common divisor 1, and let $t$ denote an integer. A well-known result in the elementary
theory of numbers is that the equation

\[ m_1 x_1 + \cdots + m_k x_k = t \tag{1} \]

has infinitely many solutions in integers $x_1, \dots, x_k$. Furthermore, there exists an
integer $\sigma(\bar m)$ which depends on $\bar m = (m_1, \dots, m_k)$ such that (1) has a solution in
non-negative integers $x_1, \dots, x_k$ for all $t \ge \sigma(\bar m)$, but no solution of this kind exists when
$t = \sigma(\bar m) - 1$. In this note we prove a refinement of this result by showing that a set
of consecutive integers can be obtained by allowing the $x_i$ in (1) to range over suitable
sets of consecutive integers. For example, every number $t$ with $6 \le t \le 11$ can be
expressed in the form $3x + 4y$ with $0 \le x \le 3$, $0 \le y \le 2$. Later on we express facts
like this by writing

\[ [6, 11] \subseteq 3[0, 3] + 4[0, 2]. \tag{2} \]
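
The example is small enough to verify by listing all the sums. The following Python sketch does so. It is an illustration only, and the function names `interval` and `combo` are hypothetical.

```python
def interval(i, j):
    """The set of integers x with i <= x <= j."""
    return set(range(i, j + 1))

def combo(coeffs, intervals):
    """All sums m_1*x_1 + ... + m_k*x_k with each x_i taken from its own interval."""
    sums = {0}
    for m, values in zip(coeffs, intervals):
        sums = {s + m * x for s in sums for x in values}
    return sums

values = combo([3, 4], [interval(0, 3), interval(0, 2)])
print(sorted(values))
print(interval(6, 11) <= values)
```

The listing also shows that the interval cannot be lengthened at either end, since neither 5 nor 12 is of the form $3x + 4y$ with $0 \le x \le 3$, $0 \le y \le 2$.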


The following notation is used: $I$, $N$, and $P$ denote the set of all integers, the set
of all non-negative integers, and the set of all positive integers respectively. Also, for
any pair of elements $i, j \in I$, define $[i, j] = \{x : x \in I,\ i \le x \le j\}$; furthermore, given
sets $I_1, \dots, I_k \subseteq I$ together with elements $m_1, \dots, m_k \in I$, define

\[ m_1 I_1 + \cdots + m_k I_k = \{ m_1 x_1 + \cdots + m_k x_k : x_i \in I_i \ (i = 1, \dots, k) \}. \tag{3} \]

For each $k \in P$ and $J \subseteq I$, let $J^k$ denote the set of all $k$-dimensional vectors over $J$;
next, for elements $\bar x, \bar y \in I^k$ with $\bar x = (x_1, \dots, x_k)$, $\bar y = (y_1, \dots, y_k)$ define the usual dot
product $\bar x \cdot \bar y = x_1 y_1 + \cdots + x_k y_k$; finally, define $\bar x < \bar y$ whenever $x_i < y_i$ for
$i = 1, \dots, k$, and define $\bar x \le \bar y$ whenever $x_i \le y_i$ for $i = 1, \dots, k$.

Our main result may be succinctly stated in this notation as follows. 


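THEOREM 1. Suppose $k - 1, m_1, \dots, m_k \in P$ and $m_1, \dots, m_k$ have greatest common
divisor 1; let $\bar m = (m_1, \dots, m_k)$ and $m = \max\{m_1, \dots, m_k\}$; suppose $\bar u, \bar v \in I^k$ satisfy

\[ \bar v - \bar u \ge (m - 1, \dots, m - 1), \tag{4} \]

\[ \bar m \cdot (\bar v - \bar u) > 2(m - 1)(m_1 + \cdots + m_k). \tag{5} \]

Then

\[ [\bar m \cdot \bar u + \sigma(\bar m),\ \bar m \cdot \bar v - \sigma(\bar m)] \subseteq m_1 [u_1, v_1] + \cdots + m_k [u_k, v_k], \tag{6} \]

where $\bar u = (u_1, \dots, u_k)$, $\bar v = (v_1, \dots, v_k)$, and $\sigma(\bar m)$ is the function defined after (1).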
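
A brute-force check of one instance may help fix the meaning of the hypotheses. In the Python sketch below all function names are hypothetical, the sample vectors are arbitrary, and the search limit used to locate the threshold function of (1) is a generous assumed bound rather than a sharp one.

```python
from functools import reduce
from math import gcd

def sigma(ms):
    """Smallest s such that every t >= s is a non-negative integer combination of ms."""
    if reduce(gcd, ms) != 1:
        raise ValueError("the coefficients must have greatest common divisor 1")
    limit = 2 * max(ms) ** 2              # assumed to lie beyond the last gap
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for t in range(1, limit + 1):
        reachable[t] = any(t >= m and reachable[t - m] for m in ms)
    gaps = [t for t in range(limit + 1) if not reachable[t]]
    return gaps[-1] + 1 if gaps else 0

def dot(xs, ys):
    """Usual dot product of two integer vectors."""
    return sum(x * y for x, y in zip(xs, ys))

def box_sums(ms, us, vs):
    """The set m_1[u_1, v_1] + ... + m_k[u_k, v_k]."""
    sums = {0}
    for m, u, v in zip(ms, us, vs):
        sums = {s + m * x for s in sums for x in range(u, v + 1)}
    return sums

def hypotheses_hold(ms, us, vs):
    """Conditions (4) and (5) of Theorem 1."""
    m = max(ms)
    diff = [v - u for u, v in zip(us, vs)]
    return all(d >= m - 1 for d in diff) and dot(ms, diff) > 2 * (m - 1) * sum(ms)

def conclusion_holds(ms, us, vs):
    """Inclusion (6) of Theorem 1."""
    low = dot(ms, us) + sigma(ms)
    high = dot(ms, vs) - sigma(ms)
    return set(range(low, high + 1)) <= box_sums(ms, us, vs)

ms, us, vs = [3, 5, 7], [0, 0, 0], [13, 13, 13]   # sample data (assumed)
print(sigma(ms), hypotheses_hold(ms, us, vs), conclusion_holds(ms, us, vs))
```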




Before proving Theorem 1, we shall state and prove a result dealing with the
2-dimensional situation which is sharper than the result provided by taking $k = 2$ in
Theorem 1. Furthermore, the proof of Theorem 2 gives some insight for the proof of 
Theorem 1. 


THEOREM 2. Suppose $m_1, m_2 \in P$ such that $m_1$ and $m_2$ are relatively prime; also,
suppose $u_1, u_2, v_1, v_2 \in I$ such that $v_1 - u_1 \ge m_2 - 1$, $v_2 - u_2 \ge m_1 - 1$. Then

\[ [m_1 u_1 + m_2 u_2 + (m_1 - 1)(m_2 - 1),\ m_1 v_1 + m_2 v_2 - (m_1 - 1)(m_2 - 1)] \subseteq m_1 [u_1, v_1] + m_2 [u_2, v_2]. \tag{7} \]


Proof: It is well known that $\sigma(m_1, m_2) = (m_1 - 1)(m_2 - 1)$, where $\sigma(m_1, m_2) - 1$
denotes the largest integer not expressible in the form $m_1 x + m_2 y$ with $x, y \in N$. Let
$\bar m = (m_1, m_2)$, $\bar u = (u_1, u_2)$, and $\bar v = (v_1, v_2)$, then it follows from the definition of
$\sigma(\bar m)$ that

\[ \bar m \cdot \bar u + \sigma(\bar m) + N \subseteq m_1 (u_1 + N) + m_2 (u_2 + N), \tag{8} \]

\[ \bar m \cdot \bar v - \sigma(\bar m) - N \subseteq m_1 (v_1 - N) + m_2 (v_2 - N). \tag{9} \]

Hence, the intersection of the sets on the left in (8) and (9) is contained in the
intersection of the sets on the right in (8) and (9). That is,

\[ [\bar m \cdot \bar u + \sigma(\bar m),\ \bar m \cdot \bar v - \sigma(\bar m)] \subseteq \big( m_1 (u_1 + N) + m_2 (u_2 + N) \big) \cap \big( m_1 (v_1 - N) + m_2 (v_2 - N) \big). \tag{10} \]

Now we prove a remarkable identity which gives a valid instance of intersection
distributing over addition.

\[
\begin{aligned}
\big( m_1 (u_1 + N) + m_2 (u_2 + N) \big) &\cap \big( m_1 (v_1 - N) + m_2 (v_2 - N) \big) \\
&= m_1 \big( (u_1 + N) \cap (v_1 - N) \big) + m_2 \big( (u_2 + N) \cap (v_2 - N) \big).
\end{aligned}
\tag{11}
\]

Of course, the set on the right in (11) is just

\[ m_1 [u_1, v_1] + m_2 [u_2, v_2], \tag{12} \]

so (10), (11), and (12) combine to imply (7). It remains to prove (11).
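
Although the sets on the left of (11) are infinite, membership of a given integer in each of them can be decided by a finite search, so the identity can be spot checked by machine. The Python sketch below does this for two sample choices of the data satisfying the hypotheses of Theorem 2. The function names and the sample values are assumptions, and the coefficients are taken to be positive.

```python
def in_upper(h, m1, m2, u1, u2):
    """True when h = m1*x + m2*y for some integers x >= u1 and y >= u2."""
    x = u1
    # The loop condition keeps y = (h - m1*x) / m2 at least u2.
    while m1 * x + m2 * u2 <= h:
        if (h - m1 * x) % m2 == 0:
            return True
        x += 1
    return False

def in_lower(h, m1, m2, v1, v2):
    """True when h = m1*x + m2*y for some integers x <= v1 and y <= v2."""
    # Negating h, x and y turns the lower quadrant into an upper one.
    return in_upper(-h, m1, m2, -v1, -v2)

def in_box(h, m1, m2, u1, u2, v1, v2):
    """True when h = m1*x + m2*y with u1 <= x <= v1 and u2 <= y <= v2."""
    return any((h - m1 * x) % m2 == 0 and u2 <= (h - m1 * x) // m2 <= v2
               for x in range(u1, v1 + 1))

def identity_11_holds(m1, m2, u1, u2, v1, v2):
    """Compare both sides of (11); outside the range tested both sides are empty."""
    low = m1 * u1 + m2 * u2
    high = m1 * v1 + m2 * v2
    return all(
        (in_upper(h, m1, m2, u1, u2) and in_lower(h, m1, m2, v1, v2))
        == in_box(h, m1, m2, u1, u2, v1, v2)
        for h in range(low, high + 1)
    )

print(identity_11_holds(3, 4, 0, 0, 3, 2))
print(identity_11_holds(2, 5, -3, 1, 1, 2))
```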

Consider the set of points $I \times I$ in the Cartesian plane. The subsets $(u_1 + N)
\times (u_2 + N)$ and $(v_1 - N) \times (v_2 - N)$ of $I \times I$ lie in upper and lower quadrants of
the plane whose intersection contains the set $[u_1, v_1] \times [u_2, v_2]$. This situation is
illustrated in Figure 1. We want to study the linear form $m_1 x + m_2 y$ evaluated over
all points $(x, y) \in I \times I$; in particular, we are interested in points which have equal
evaluations. Given an element $h \in I$, the set $L_h$ of all points $(x, y) \in I \times I$ such that
$m_1 x + m_2 y = h$ is situated on a unique line having slope $-m_1/m_2$. Also, it is easy
to see that if $(x', y') \in (I \times I) \cap L_h$, then $L_h = \{ (x' + j m_2,\ y' - j m_1) : j \in I \}$.
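
The description of the lattice points on such a line, on which the proof depends, can be seen in a small case. The Python sketch below lists the points with $3x + 4y = 5$ in a window of $x$ values. The function name and the sample numbers are assumptions, and the coefficients are taken to be relatively prime as in Theorem 2.

```python
def lattice_points(m1, m2, h, x_min, x_max):
    """Integer points (x, y) with m1*x + m2*y == h and x_min <= x <= x_max, by increasing x."""
    return [(x, (h - m1 * x) // m2)
            for x in range(x_min, x_max + 1)
            if (h - m1 * x) % m2 == 0]

points = lattice_points(3, 4, 5, -10, 10)

# Steps between consecutive points; each should be (m2, -m1) = (4, -3).
steps = {(b[0] - a[0], b[1] - a[1]) for a, b in zip(points, points[1:])}
print(points, steps)
```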

To prove (11), note that the set on the right is contained in the set on the left;
suppose the reverse is not true. From this assumption we shall deduce a contradiction.
Under this assumption it follows that there exists an $h \in I$ such that $L_h$ has points in
common with both

\[ U = \big( (u_1 + N) \times (u_2 + N) \big) \setminus \big( [u_1, v_1] \times [u_2, v_2] \big) \]

and

\[ V = \big( (v_1 - N) \times (v_2 - N) \big) \setminus \big( [u_1, v_1] \times [u_2, v_2] \big), \]

but $L_h$ has no point in common with $B = [u_1, v_1] \times [u_2, v_2]$.


FIG. 1. The set of points $(u_1 + N) \times (u_2 + N)$ lies in the quadrant above and to the right of the
point $(u_1, u_2)$, the set of points $(v_1 - N) \times (v_2 - N)$ lies in the quadrant below and to the left of the
point $(v_1, v_2)$, and the set of points $[u_1, v_1] \times [u_2, v_2]$ lies in the box.


Suppose $(x', y') \in L_h \cap U$ and $(x'', y'') \in L_h \cap V$; since $(x', y') \notin B$, either $x' > v_1$ or
$y' > v_2$. If $x' > v_1$, then $x'' < u_1$ because $(x', y'), (x'', y'') \in L_h$ and $(x'', y'') \notin B$. In this
case we suppose $(x', y')$ has been selected from $L_h \cap U$ so that $x'$ is minimal, and
$(x'', y'')$ has been selected from $L_h \cap V$ so that $x''$ is maximal. Since $(x', y'), (x'', y'') \in L_h$,
and $L_h \cap B = \emptyset$, we must have $x' - x'' = m_2$. But $x' > v_1$ and $x'' < u_1$ implies
$x' - 1 \ge v_1$ and $x'' + 1 \le u_1$; hence, $m_2 - 2 = x' - x'' - 2 \ge v_1 - u_1$, contradicting
the hypothesis $v_1 - u_1 \ge m_2 - 1$. In the case $y' > v_2$, it follows that $y'' < u_2$. This
time the points $(x', y')$ and $(x'', y'')$ are selected so that $y'$ is minimal and $y''$ is maximal.
The argument goes just as before; we must have $y' - y'' = m_1$ which leads to the
contradiction $v_2 - u_2 \le m_1 - 2$. This completes the proof of Theorem 2.

Now we prove Theorem 1. To do this, we prove an identity having the form of 
(11), but subject to the conditions (4) and (5). 


LEMMA. If $k$-dimensional vectors $\bar m$, $\bar u$, and $\bar v$ satisfy the hypothesis of Theorem 1,
then

\[ \sum_{i=1}^{k} m_i (u_i + N) \cap \sum_{i=1}^{k} m_i (v_i - N) = \sum_{i=1}^{k} m_i \big( (u_i + N) \cap (v_i - N) \big). \tag{13} \]


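Theorem 1 is an immediate consequence of the Lemma; its application is the
justification of the penultimate equality in the following string of formulas.

\[
\begin{aligned}
[\bar m \cdot \bar u + \sigma(\bar m),\ \bar m \cdot \bar v - \sigma(\bar m)]
&= (\bar m \cdot \bar u + \sigma(\bar m) + N) \cap (\bar m \cdot \bar v - \sigma(\bar m) - N) \\
&\subseteq \sum_{i=1}^{k} m_i (u_i + N) \cap \sum_{i=1}^{k} m_i (v_i - N) \\
&= \sum_{i=1}^{k} m_i \big( (u_i + N) \cap (v_i - N) \big) = \sum_{i=1}^{k} m_i [u_i, v_i].
\end{aligned}
\tag{14}
\]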


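To prove Theorem 1 completely, it remains to prove the Lemma. For each $i \in I$, let $L_i = \{\bar{x} : \bar{x} \in I^k;\ \bar{m} \cdot \bar{x} = i\}$, and suppose the Lemma is false. Then there exists $h \in I$ such that $L_h \cap U \ne \emptyset$ and $L_h \cap V \ne \emptyset$, but $L_h \cap B = \emptyset$, where

$$U = \{\bar{x} : \bar{x} \in I^k,\ \bar{x} \ge \bar{u}\} \setminus B,$$

$$V = \{\bar{x} : \bar{x} \in I^k,\ \bar{x} \le \bar{v}\} \setminus B,$$

$$B = [u_1, v_1] \times \cdots \times [u_k, v_k].$$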


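Suppose $\bar{x}' \in U$ is selected so that

$$\sum_{i=1}^{k} \max\{v_i, x_i'\} \tag{15}$$

is minimal, where $\bar{x}' = (x_1', \ldots, x_k')$. Since $\bar{x}' \notin B$, there exists $r \in [1, k]$ such that $x_r' > v_r$. Furthermore, there exists $s \in [1, k]$ such that $x_s' < v_s$ since otherwise $\bar{x}' \ge \bar{v}$, which implies $h = \bar{m} \cdot \bar{x}' > \bar{m} \cdot \bar{x}$ for all $\bar{x} \le \bar{v}$, contradicting the assumption $L_h \cap V \ne \emptyset$. Of course, $r \ne s$, so we have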


k 


(16) h= X mx; + m(x,—m,) + m(x, + m,); 
ltrs 
(17) X,—mM,—u, 2 (v,+1)—m,—u,=(v, — u,) —m, + 1 
2 (v, — u,)-m+120. 




Hence, by the minimality assumption made in (15), 
(18) max {v,, x, — m,} + max {v,,x, + m,} = max {v,,x,} + max {v,, xz}. 


Hence, 
max {v,, x, + m,} > max {v,, x} = 0,; 


(19) X,+M,>v,5 Xx, >v,—m,2v,— mM. 
This implies 
(20) xX’ > —(m,---,m). 


Suppose x” € V is selected so that 


k 
(21) x min {u,,x;} 
i=1 
is maximal where xX” = (x;,---,x;,). Now an argument running parallel to (15)-(21) 
can be given to show that 


(22) x" <i +(m,---,m). 
Together (20) and (22) imply 


k 
0= mxX'—mxX"> YD m(v,—m+1)—-—(u;+m-—1)) 
i= 


(23) 
= m:(6—uz)—2m—-1) L m,. 


i=1 
But (5) implies 
k 
(24) m:(6 —u)—2m—1) X& m,>0, 


i=1 
so (23) provides the required contradiction, and we conclude that the Lemma is true. 
The results proved in this paper arose in connection with our investigation [1] of 
the smallest set <m-xX:1> < P containing 1 which is closed under the operation 
m: xX, where m =(m,,---,m,) is a given k-tuple of relatively prime positive integers. 


This research was supported by the Office of Naval Research under contract number N-00014—67—-A- 
0112-0057 NR 044-402, and by the National Science Foundation under grant number GJ-092. 
Reproduction in whole or in part is permitted for any purpose of the United States Government. 


At the time this paper was written, R. Rado was Visiting Professor at the Faculty of Mathematics, 
Department of Combinatorics and Optimization, University of Waterloo, Ontario, Canada. 


Reference 


1. D. A. Klarner and R. Rado, Arithmetic Properties of Certain Recursively Defined Sets, to 
appear. 


THE EQUATION $x'(t) = ax(t) + bx(t - \tau)$ WITH "SMALL" DELAY


R. D. DRIVER, University of Rhode Island, 
D. W. SASSER, Sandia Laboratories, Albuquerque, 
M. L. SLATER, Texas Christian University 


One of the simplest examples of a delay differential equation is the linear scalar equation

$$x'(t) = ax(t) + bx(t - \tau), \tag{1}$$

where $a$, $b \ne 0$, and $\tau > 0$ are real constants. As is well known and easily proved, for every given function $\phi \in C([-\tau, 0], R)$, there exists a unique function $x \in C([-\tau, \infty), R)$ which satisfies the initial condition

$$x(t) = \phi(t) \quad \text{for } t \in [-\tau, 0] \tag{2}$$

and which satisfies Eq. (1) for $t > 0$. We shall call this function $x$ the solution of Eq. (1) with initial condition (2) or, more briefly, the solution of Eqs. (1) and (2).

Equation (1) occurs in a number of applications. For example, a certain model 
for population growth gives rise to the nonlinear equation 


$$x'(t) = -cx(t - \tau)[1 + x(t)].$$


Here the population is proportional to 1 + x(t). The same nonlinear equation has 
even arisen in the study of the distribution of prime numbers. The stability of the 
trivial solution of this nonlinear equation depends upon the stability of the trivial 
solution of its linear approximation 


$$x'(t) = -cx(t - \tau),$$


a special case of (1). See, for example, Wright [11]. 
Another equation which has been proposed as a model for population growth, 
and also for gonorrhea epidemiology, is 


$$x'(t) = g(x(t)) - g(x(t - L)).$$


The linear approximation to this equation is Eq. (1) with $a + b = 0$. See Cooke and Yorke [2].

As a third application, consider the problem of mixing of salt brines encountered 
in any elementary differential equations text. The usual example assumes that an 
inflowing salt solution is instantaneously perfectly mixed with the brine in the tank. 
The mixture simultaneously flows out at the bottom of the tank at the same rate as 
the inflow. The resulting differential equation for the amount of salt in the tank is 
$y'(t) = k - cy(t)$. But if one eliminates the assumption of instantaneous perfect
mixing, he is naturally led [4] to the equation


$$y'(t) = k - cy(t - \tau).$$




It now suffices to introduce $x(t) = y(t) - k/c$ and the equation becomes $x'(t) = -cx(t - \tau)$, a special case of (1) again.
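
The substitution can be checked in one line. The following is only a worked restatement of the step above, using the same symbols $k$, $c$, and $\tau$:

$$x'(t) = y'(t) = k - c\,y(t - \tau) = k - c\left[x(t - \tau) + \frac{k}{c}\right] = -c\,x(t - \tau).$$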

The problem represented by Eqs. (1) and (2) has received much study. The known 
results for this problem include series expansions of the solutions, due to Schmidt [9] 
and others, (see Bellman and Cooke [1] or El’sgol’ts [5]); a detailed study 
of the asymptotic behavior of solutions by Myškis [6]; and an asymptotic
characterization of the solutions in case the delay is "small", due to Rjabov [8]
(see also [3]). Actually equation (1) is merely a simple prototype of the various 
equations considered by these authors. 

The present note, restricted to (1), shows how some of the known results can be 
simply obtained using elementary calculus. When the delay $\tau$ is small, we shall
prove that certain similarities exist between the solutions of Eq. (1) and those of an
equation without delay.


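THEOREM. Let

$$-\frac{1}{e} < b\tau e^{-a\tau} < e. \tag{3}$$

Then, in the real interval $(a - 1/\tau, \infty)$, the characteristic equation

$$\lambda = a + be^{-\lambda\tau} \tag{4}$$

has a unique solution $\lambda$. Moreover $\lambda < a + 1/\tau$. And, if $\lambda$ is this particular solution of (4), and if $x$ is the solution of Eqs. (1) and (2), then

$$\lim_{t \to \infty} [x(t)e^{-\lambda t}] = \frac{1}{1 + b\tau e^{-\lambda\tau}} \left[\phi(0) + be^{-\lambda\tau} \int_{-\tau}^{0} e^{-\lambda s}\phi(s)\,ds\right], \tag{5}$$

the limit being approached exponentially.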


REMARK. For an equation with several delays,

$$x'(t) = ax(t) + \sum_{j=1}^{m} b_j x(t - \tau_j),$$

where $0 < \tau_j \le \tau$ for $j = 1, \ldots, m$, a similar result holds under the hypothesis


$$\tau \sum_{j=1}^{m} |b_j| e^{1 - a\tau} < 1.$$
For Eq. (1), however, this condition is much stricter than (3). The case of infinitely 
many distributed delays has been treated elsewhere [4]. 
Proof of the Theorem. To analyze the characteristic equation (4), let us consider the function $D$, defined by $D(p) = p - a - be^{-p\tau}$. It follows from the first inequality of (3) that

$$D\left(a - \frac{1}{\tau}\right) = -\frac{1}{\tau} - be^{1 - a\tau} < 0.$$

Again using the first inequality of (3), we find that for all $p \ge a - 1/\tau$

$$D'(p) = 1 + b\tau e^{-p\tau} > 1 - e^{a\tau - 1}e^{-p\tau} \ge 0.$$

Since $\lim_{p \to \infty} D(p) = \infty$, it follows that there is a unique $\lambda > a - 1/\tau$ such that $D(\lambda) = 0$.

Invoking the second inequality of (3), we find that

$$D\left(a + \frac{1}{\tau}\right) = \frac{1}{\tau} - be^{-a\tau - 1} > 0.$$

Thus it follows that

$$a - \frac{1}{\tau} < \lambda < a + \frac{1}{\tau}.$$

This in turn enables us to estimate

$$|b\tau e^{-\lambda\tau}| = |\lambda - a|\tau < 1. \tag{6}$$
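
As a numerical illustration of the root-location argument, the following Python sketch finds $\lambda$ by bisection on the interval $(a - 1/\tau, a + 1/\tau)$, where $D$ is increasing and changes sign under hypothesis (3). The function names `satisfies_condition_3` and `char_root`, and the sample values $a = -1$, $b = 0.5$, $\tau = 1$, are illustrative assumptions and do not come from the paper.

```python
import math


def satisfies_condition_3(a, b, tau):
    """Check hypothesis (3): -1/e < b*tau*exp(-a*tau) < e."""
    s = b * tau * math.exp(-a * tau)
    return -1.0 / math.e < s < math.e


def char_root(a, b, tau, iters=200):
    """Bisection for D(p) = p - a - b*exp(-p*tau) on (a - 1/tau, a + 1/tau).

    Assumes (3) holds, so D is negative at the left end and positive at the right.
    """
    lo, hi = a - 1.0 / tau, a + 1.0 / tau
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - a - b * math.exp(-mid * tau) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical sample parameters (not from the paper).
a, b, tau = -1.0, 0.5, 1.0
lam = char_root(a, b, tau)
print(satisfies_condition_3(a, b, tau), lam, abs(b * tau * math.exp(-lam * tau)))
```

The last printed quantity is the left side of estimate (6), which should come out below 1 whenever (3) holds.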


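Now define $y(t) = x(t)e^{-\lambda t}$ and find the equations, equivalent to (1) and (2), for $y$:

$$y'(t) = -be^{-\lambda\tau}[y(t) - y(t - \tau)] \quad \text{for } t > 0, \tag{7}$$

with the initial condition

$$y(t) = \phi(t)e^{-\lambda t} \quad \text{for } -\tau \le t \le 0. \tag{8}$$

These equations, in turn, are equivalent to

$$y(t) = -be^{-\lambda\tau} \int_{t-\tau}^{t} y(s)\,ds + C \quad \text{for } t \ge 0 \tag{9}$$

with (8), where

$$C = \phi(0) + be^{-\lambda\tau} \int_{-\tau}^{0} \phi(s)e^{-\lambda s}\,ds. \tag{10}$$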
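
The passage from (1) to (7) and then to (9) can be spelled out as follows; this is a worked sketch that uses only the definitions already introduced. Differentiating $y(t) = x(t)e^{-\lambda t}$ and substituting (1) gives

$$y'(t) = (a - \lambda)\,y(t) + be^{-\lambda\tau}\,y(t - \tau),$$

and the characteristic equation (4) says $\lambda - a = be^{-\lambda\tau}$, which yields (7). Integrating (7) from $0$ to $t$ and shifting the variable in the delayed term,

$$y(t) - y(0) = -be^{-\lambda\tau}\left[\int_{t-\tau}^{t} y(s)\,ds - \int_{-\tau}^{0} y(s)\,ds\right],$$

so (9) holds with $C = y(0) + be^{-\lambda\tau}\int_{-\tau}^{0} y(s)\,ds$, which is (10) once (8) is substituted.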
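Since $1 + b\tau e^{-\lambda\tau} > 0$, we can define

$$z(t) = y(t) - \frac{C}{1 + b\tau e^{-\lambda\tau}}$$

and get another equivalent problem:

$$z(t) = -be^{-\lambda\tau} \int_{t-\tau}^{t} z(s)\,ds \quad \text{for } t \ge 0 \tag{11}$$

with

$$z(t) = \phi(t)e^{-\lambda t} - \frac{C}{1 + b\tau e^{-\lambda\tau}} \quad \text{for } -\tau \le t \le 0. \tag{12}$$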




Let $M$ be an upper bound for $|z(t)|$ on $[-\tau, 0]$. Then we shall show that $|z(t)| \le M$ for all $t \ge -\tau$. Given any $\varepsilon > 0$, suppose (for contradiction) that $|z(t)| < M + \varepsilon$ for $-\tau \le t < t_1$ and $|z(t_1)| = M + \varepsilon$. Then, using (6) for the first time, we find

$$M + \varepsilon = |z(t_1)| \le |be^{-\lambda\tau}| \int_{t_1 - \tau}^{t_1} |z(s)|\,ds \le |b\tau e^{-\lambda\tau}|(M + \varepsilon) < M + \varepsilon,$$

which is nonsense. Thus $|z(t)| < M + \varepsilon$ for all $t \ge -\tau$, and, since $\varepsilon$ was arbitrary, $|z(t)| \le M$ for all $t \ge -\tau$.

It now follows, by an easy induction, that $|z(t)| \le |b\tau e^{-\lambda\tau}|^n M$ for all $t \ge n\tau - \tau$ $(n = 0, 1, 2, \ldots)$, and hence $z(t) \to 0$ exponentially as $t \to \infty$. This is equivalent to (5).
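
The limit (5) can also be observed numerically. The sketch below integrates Eq. (1) with a forward-Euler method of steps (a numerical technique chosen here for brevity, not the paper's method) and compares $x(t)e^{-\lambda t}$ at a moderately large $t$ with the right-hand side of (5), evaluated by the trapezoidal rule. All function names, the constant initial function, and the parameter values are illustrative assumptions.

```python
import math


def char_root(a, b, tau, iters=200):
    """Bisection for the root of D(p) = p - a - b*exp(-p*tau) in (a-1/tau, a+1/tau)."""
    lo, hi = a - 1.0 / tau, a + 1.0 / tau
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - a - b * math.exp(-mid * tau) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def predicted_limit(a, b, tau, phi, n=2000):
    """Right-hand side of (5); the integral uses the trapezoidal rule."""
    lam = char_root(a, b, tau)
    k = b * math.exp(-lam * tau)
    h = tau / n
    g = [math.exp(-lam * (-tau + i * h)) * phi(-tau + i * h) for i in range(n + 1)]
    integral = h * (sum(g) - 0.5 * (g[0] + g[-1]))
    return (phi(0.0) + k * integral) / (1.0 + tau * k)


def simulate_scaled(a, b, tau, phi, t_end, n=1000):
    """Forward-Euler method of steps; returns x(t_end) * exp(-lam * t_end)."""
    lam = char_root(a, b, tau)
    dt = tau / n
    xs = [phi(-tau + i * dt) for i in range(n + 1)]  # history on [-tau, 0]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # xs[-1] is x(t), xs[-1 - n] is x(t - tau)
        xs.append(xs[-1] + dt * (a * xs[-1] + b * xs[-1 - n]))
    return xs[-1] * math.exp(-lam * steps * dt)


# Hypothetical sample problem: a = -1, b = 0.5, tau = 1, phi identically 1.
phi = lambda s: 1.0
print(simulate_scaled(-1.0, 0.5, 1.0, phi, 20.0), predicted_limit(-1.0, 0.5, 1.0, phi))
```

The two printed numbers should agree to within the discretization error of the Euler scheme.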


As has already been indicated, the result of this theorem is not really new. It can 
essentially be found, using Laplace-transform and residue-theory methods, in Chapter 
4 of [1] or Chapter 2 of [5], for example. A similar qualitative result is obtained for 
much more general equations by Myskis [6] (Chapters 3 and 4) using comparison 
techniques, and by Rjabov [8] and Uvarov [10] (see also [3]). However, in the 
latter works, the value of the limit of $x(t)e^{-\lambda t}$ (the right hand side of (5)) is not
obtained. 

The thing which is apparently new here is the simple proof of equation (5), using 
nothing but elementary calculus. 

The theorem provides the asymptotic behavior of $x(t)$, except in the case when

$$\phi(0) + be^{-\lambda\tau} \int_{-\tau}^{0} e^{-\lambda s}\phi(s)\,ds = 0. \tag{13}$$

And a randomly-chosen function $\phi \in C([-\tau, 0], R)$ will rarely satisfy (13). More precisely, one can show that equation (13) is satisfied only when $\phi$ belongs to a certain nowhere dense subset of $C([-\tau, 0], R)$ with the sup norm.

M. J. Norris [7] has independently considered an equation like (1) but with 
infinitely many (distributed) delays. Assuming negative coefficients and a sufficiently 
small maximum delay, he has determined a two-term asymptotic representation; and, 
for the special case of (1), his proof is also quite elementary. 

We conclude by giving, as corollaries, some easy consequences of the theorem 
proved here. 


COROLLARY 1. If $-1/e < b\tau e^{-a\tau} < e$, then equation (1) has no oscillatory solution
except possibly in the unlikely case that (13) holds.


COROLLARY 2. Let $-1/e < b\tau e^{-a\tau} < e$. Then
(i) $\lambda < 0$ whenever $a + b < 0$ and $a\tau < 1$,

(ii) $\lambda = 0$ whenever $a + b = 0$ and $a\tau < 1$,

(iii) $\lambda > 0$ whenever either $a + b > 0$ or $a\tau > 1$.




The trivial solution of (1) is (uniformly) asymptotically stable in case (i), (uniformly) 
stable in case (ii), and unstable in case (iii). 


Proof. Whenever $a\tau < 1$, it follows that $b\tau > -e^{a\tau - 1} > -1$. Referring to the
proof of the theorem we find:

(i) If $a + b < 0$ and $a\tau < 1$, then $D(0) > 0$ and $0 > a - 1/\tau$. Thus $\lambda < 0$.

(ii) If $a + b = 0$ and $a\tau < 1$, then $D(0) = 0$ and $0 > a - 1/\tau$. Thus $\lambda = 0$.

(iii) If $a + b > 0$, then $D(0) = -a - b < 0$. Thus $\lambda > 0$.
And if $a\tau > 1$, then $a - 1/\tau > 0$. Thus $\lambda > 0$ again.

The three stability assertions now follow easily from equation (5). 
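
The case analysis of Corollary 2 can be mirrored in a few lines of Python. This is a sketch under stated assumptions: `classify` is a hypothetical helper, the tolerance `tol` used to decide whether $\lambda$ is zero is an arbitrary numerical choice, and the sign of $\lambda$ is obtained from a bisection root rather than from the inequalities on $a$, $b$, and $\tau$.

```python
import math


def char_root(a, b, tau, iters=200):
    """Bisection for the root of D(p) = p - a - b*exp(-p*tau) in (a-1/tau, a+1/tau)."""
    lo, hi = a - 1.0 / tau, a + 1.0 / tau
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - a - b * math.exp(-mid * tau) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def classify(a, b, tau, tol=1e-9):
    """Stability of the trivial solution of (1), read off from the sign of lambda."""
    s = b * tau * math.exp(-a * tau)
    if not (-1.0 / math.e < s < math.e):
        raise ValueError("hypothesis (3) is not satisfied")
    lam = char_root(a, b, tau)
    if lam < -tol:
        return "asymptotically stable"
    if lam > tol:
        return "unstable"
    return "stable"


print(classify(-1.0, 0.5, 1.0))   # a + b < 0 and a*tau < 1
print(classify(1.0, -1.0, 0.5))   # a + b = 0 and a*tau < 1
print(classify(0.5, 0.5, 1.0))    # a + b > 0
```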


COROLLARY 3. Let $-1/e < b\tau e^{-a\tau} < e$ and $a\tau < 1$. Then the asymptotic behavior
of solutions of (1) (assuming (13) is not satisfied) is qualitatively the same as that
of the ordinary differential equation obtained by either ignoring the delay,

$$y'(t) = ay(t) + by(t),$$

or by approximating $x(t - \tau)$ with the first two terms of a Taylor's series

$$y'(t) = ay(t) + by(t) - b\tau y'(t).$$
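
To make the comparison concrete, here is a short worked computation using only the two approximating equations. Ignoring the delay gives $y' = (a + b)y$, with exponent $a + b$. The Taylor approximation rearranges to

$$(1 + b\tau)\,y'(t) = (a + b)\,y(t), \qquad \text{i.e.} \qquad y'(t) = \frac{a + b}{1 + b\tau}\,y(t).$$

Since $a\tau < 1$ forces $b\tau > -1$ (as noted in the proof of Corollary 2), the factor $1 + b\tau$ is positive, so both exponents have the sign of $a + b$, which is also the sign of $\lambda$ in Corollary 2.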


The various regions of the (a, b)-plane, mentioned above, are indicated in the 
following figure. A complete stability diagram can be found in El’sgol’ts [5], p. 56 
(where our a and b are replaced by —a and —b). 


[Figure: regions of instability and asymptotic stability in the $(a, b)$-plane.]




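COROLLARY 4. Let $a + b = 0$ and $-1 < a\tau < 1$. Then the solution $x$ of (1)
and (2) approaches a limit as $t \to \infty$:

$$\lim_{t \to \infty} x(t) = \frac{1}{1 + b\tau}\left[\phi(0) + b\int_{-\tau}^{0} \phi(s)\,ds\right].$$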


Presented to the American Mathematical Society, January 17, 1972 at Las Vegas. This work was 
partially supported by the United States Atomic Energy Commission. 


References 


1. R. Bellman and K. L. Cooke, Differential-Difference Equations, Academic Press, New York, 
1963. MR 26 # 5259. 

2. K. L. Cooke and J. A. Yorke, Equations modelling population growth, economic growth, 
and gonorrhea epidemiology, Ordinary Differential Equations, Academic Press, New York, 1972, 
35-53. 

3. R. D. Driver, On Ryabov’s asymptotic characterization of the solutions of quasi-linear differ- 
ential equations with small delays, SIAM Rev., 10 (1968) 329-341. MR 38 # 2410. 

4. R. D. Driver, Some harmless delays, Delay and Functional Differential Equations and their Applications, Academic Press, New York, 1972, 103-119.

5. L. E. El’sgol’ts, Introduction to the Theory of Differential Equations with Deviating Argu- 
ments, Holden-Day, San Francisco, 1966. MR 33 # 381. 

6. A. D. Myškis, Linear Differential Equations with Retarded Argument (Russian), GITTL,
Moscow, 1951. MR 14-52. 

7. M. J. Norris, unpublished notes. 

8. Ju. A. Rjabov, Certain asymptotic properties of linear systems with small time lag (Russian), 
Trudy Sem. Teor. Differencial. Uravnenii s Otklon. Argumentom Univ. Družby Narodov Patrisa
Lumumby 3 (1965) 153-164. MR 35 # 1895. 

9. E. Schmidt, Über eine Klasse linearer funktionaler Differentialgleichungen, Math. Ann., 70
(1911) 499-524. 

10. V. B. Uvarov, Asymptotic properties of the energy distribution of neutrons slowed down in 
an infinite medium (Russian), Ž. Vyčisl. Mat. i Mat. Fiz., 7 (1967) 836-851.

11. E. M. Wright, A non-linear difference-differential equation, J. Reine Angew. Math., 194 
(1955) 66-87. MR 17-272. 


THE QUATERNION CALCULUS 


C. A. DEAVOURS, The Cooper Union of New York 
(Current address: Newark State College, Union, N.J.) 


1. Introduction. Most students, upon completing a first course in complex 
analysis, have glimpsed the immense power and elegance of the subject, particularly 
in treating two dimensional physical problems. The question then arises as to whether 
an analogous calculus exists for three dimensions. Lack of an appropriate hyper- 
complex number system seems to prevent any attempt along this line from going 


Cipher Deavours received his University of Virginia Sc. D. under Gordon Latta. Since then he 
has been at The Cooper Union. His main research interest is ordinary differential equations. Editor. 




very far. Nevertheless, there exists an extensively developed four dimensional cal- 
culus, little known in this country, which was developed by R. Fueter [1] in the 
decade following 1935. A good bibliography to papers on the subject is found in [2]. 
Rose’s work on quaternion velocity potentials for axisymmetric fluid flow appears 
to be the only paper on the subject to appear in English. 

Fueter defines both right and left regular functions of a quaternion variable and 
develops the associated theory by producing analogues of both Cauchy Theorems, 
Liouville’s Theorem, and Laurent series developments. In [4], quaternion Abelian
functions having four periods are constructed and their properties studied.

Some of the essential aspects of Fueter’s calculus will be discussed in this paper, 
using a somewhat different approach. The author has found that selected topics
from this subject provide excellent optional topics for courses in complex variables, 
especially for the more enquiring students. Once acquainted with quaternions 
students often guess and prove theorems analogous to those which they have 
recently learned in the course. Science and engineering students gain greatly from 
such exposure as the use of quaternions provides them with a “concrete” example
of an algebra more complicated than that of ordinary complex numbers. (Students 
never seem to view matrices in this manner.) 

The compact quaternion form of Maxwell’s equations which has been dis- 
covered repeatedly by undergraduates over the years is included along with several 
other topics of classroom interest. 


2. Quaternions. Quaternions were invented in 1843 by the Irish mathematician 
William Rowan Hamilton after a lengthy struggle to extend the theory of complex
numbers to three dimensions. An account of Hamilton’s ultimate rejection of the 
commutative law of multiplication and the ensuing quaternion wars which raged 
afterwards is to be found in [5] and [6]. 

The algebra of quaternions has the distinction of being one of the three associative 
division algebras possible. Linear combinations are formed of the four units 1, 
i, j, k using coefficients taken from the real number field. The quaternion thus 
formed, w + xi + yj + zk will be denoted q or w+ r, where r is the usual radius 
vector of three dimensions. The w component of q is called the scalar part of the 
quaternion and r is termed its vector part. Quaternion addition and scalar multiplica- 
tion are defined in the usual manner as to constitute a linear algebra. The symbol 1 
behaves as the ordinary number one in multiplication while the other units satisfy:
$i^2 = j^2 = k^2 = -1$, $ij = k = -ji$, $jk = i = -kj$, $ki = j = -ik$. Products
of quaternions are formed using the above rules and the distributive law. Thus


$$(a + A)(b + B) = ab - A \cdot B + aB + bA + A \times B,$$


where the dot and cross indicate the usual three dimensional scalar and vector 
cross products respectively. For any quaternion $q = w + \mathbf{r}$ there exists a conjugate
quaternion, $\bar{q} = w - \mathbf{r}$, satisfying $q\bar{q} = \bar{q}q = w^2 + x^2 + y^2 + z^2 = |q|^2$. The non-negative quantity $|q|$ is termed the norm of $q$. The conjugation operation satisfies
the equation $\overline{AB} = \bar{B}\bar{A}$. Quaternion multiplication is not commutative but all
other algebraic properties of the real and complex numbers hold.
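
The product rule above translates directly into code. The following Python sketch represents a quaternion as a tuple `(w, x, y, z)`, which is an assumed representation chosen for illustration, and implements the scalar-plus-vector formula $(a + A)(b + B) = ab - A \cdot B + aB + bA + A \times B$; the names `qmul` and `conj` are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    a, A = p[0], p[1:]
    b, B = q[0], q[1:]
    dot = sum(s * t for s, t in zip(A, B))
    cross = (A[1] * B[2] - A[2] * B[1],
             A[2] * B[0] - A[0] * B[2],
             A[0] * B[1] - A[1] * B[0])
    scalar = a * b - dot
    vector = tuple(a * B[n] + b * A[n] + cross[n] for n in range(3))
    return (scalar,) + vector


def conj(q):
    """Conjugate quaternion: same scalar part, negated vector part."""
    return (q[0], -q[1], -q[2], -q[3])


i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j), qmul(j, i))          # k and -k
print(qmul((1, 2, 3, 4), conj((1, 2, 3, 4))))
```

With integer components the arithmetic is exact, so the unit relations and the identity $q\bar{q} = |q|^2$ can be checked by direct comparison.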

The skew field of quaternions is isomorphic to a subset of 4 by 4 matrices under
the mapping:

$$q \to \begin{pmatrix} w & x & y & z \\ -x & w & -z & y \\ -y & z & w & -x \\ -z & -y & x & w \end{pmatrix}$$

or to a set of 2 by 2 complex matrices related to the Pauli spin matrices [8]. The
topological properties of the quaternion group are discussed in [7]. We shall consider
functions of a quaternion variable $q$ which will be written $F(q)$; such functions
can be decomposed into a scalar and vector part which we shall write as $F(q) = \phi + \psi$.
The vector part of $F$ will be expressed in component form as $\psi = \psi_1 i + \psi_2 j + \psi_3 k$.
Generally, the four components of $F$ will be required to possess continuous partial
derivatives up to a certain order, usually first or second, for our proofs to hold but
we shall not belabor this point.
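
A quick way to see that the 4 by 4 mapping respects multiplication is to test it on sample quaternions. The Python sketch below is illustrative only: `to_matrix`, `matmul`, and the tuple representation `(w, x, y, z)` are assumptions, and the check covers two integer examples rather than proving the isomorphism.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def to_matrix(q):
    """The 4 by 4 real matrix assigned to q by the mapping in the text."""
    w, x, y, z = q
    return [[w, x, y, z],
            [-x, w, -z, y],
            [-y, z, w, -x],
            [-z, -y, x, w]]


def matmul(A, B):
    """Plain 4 by 4 matrix product on nested lists."""
    return [[sum(A[r][m] * B[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]


p, q = (1, 2, 3, 4), (5, -6, 7, 8)
print(to_matrix(qmul(p, q)) == matmul(to_matrix(p), to_matrix(q)))
```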

In the sequel, $D$ is a simply connected domain of $E^4$ with subdomain $\sigma$ having
as its boundary the closed hypersurface $\partial\sigma$. Volume elements of $\sigma$ are denoted $dV$
while the (quaternion) oriented, outwardly directed surface elements of $\partial\sigma$ are
denoted $dQ$. Introducing the quaternion gradient operator

$$\Box = \frac{\partial}{\partial w} + \nabla = \frac{\partial}{\partial w} + i\frac{\partial}{\partial x} + j\frac{\partial}{\partial y} + k\frac{\partial}{\partial z},$$

we have the following useful result.


THEOREM 2.1. Let $F = \phi + \psi$ be a function of the quaternion variable $q = w + \mathbf{r}$,
then

$$\int_{\partial\sigma} dQ\,F = \int_{\sigma} \Box F\,dV. \tag{1}$$


Proof. Equation (1) is a quaternion form of the Gauss divergence theorem for
four dimensions. Let $dQ = dQ_0 + dQ_1 i + dQ_2 j + dQ_3 k$. If $M$ is the matrix


p Wy Wo Ws 




and $[dQ] = (dQ_0, dQ_1, dQ_2, dQ_3)$ is a row vector having the same components as
the quaternion $dQ$, then the matrix product $[dQ]M$ is a row vector with the same
components as the quaternion product $dQ\,F$. By the Gauss Theorem

$$\int_{\partial\sigma} [dQ]M = \int_{\sigma} \operatorname{div}(M)\,dV,$$

where the matrix divergence of $M$ is to be taken.
It is readily verified that $\operatorname{div}(M)$ is a row vector whose four components are the
same as those of the quaternion

$$\Box F = \left(\frac{\partial}{\partial w} + \nabla\right)(\phi + \psi) = \frac{\partial\phi}{\partial w} - \nabla \cdot \psi + \frac{\partial\psi}{\partial w} + \nabla\phi + \nabla \times \psi, \tag{2}$$

which establishes the result.
Similarly, we may demonstrate the alternate form of this result:

$$\int_{\partial\sigma} F\,dQ = \int_{\sigma} F\Box\,dV,$$

where the gradient operator is understood to act on the function $F$ to its left.


3. Regular quaternion functions. In seeking to construct a differential and 
integral calculus of quaternion functions the first step would seem to be definition 
of a derivative. A (right) quaternion derivative of the function F might be formed by 
requiring the limit 


$$dF/dq = \lim\,(F(q + \Delta q) - F(q))/\Delta q$$

to exist as $\Delta q \to 0$ and be independent of path for all increments $\Delta q$. By considering
four linearly independent increments $\Delta w$, $\Delta x\,i$, $\Delta y\,j$, $\Delta z\,k$ one can derive a set of
over-determined partial differential equations to be satisfied relating the components
of $F$ under such conditions. This approach leads to nothing productive since, even
for the simple function $q^2$, the ratio of $\Delta F$ to $\Delta q$ is not independent of $\Delta q$, as was
first observed by Hamilton [9]. The best one can do is to define scalar directional
derivatives under the definition

$$d_n F = \lim\,(F(q + \varepsilon n) - F(q))/\varepsilon$$

with $\varepsilon$ real, $\varepsilon \to 0$, and $n$ a unit quaternion in the desired direction. The vector Taylor
series expansion theorem in any direction can be then obtained but no real calculus
results since only directionally dependent quantities are encountered. These ideas
were first put forward by Hamilton himself in his Elements of Quaternions, [9].
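
The path dependence for $q^2$ is easy to observe numerically. In the Python sketch below the "ratio" is taken to mean right multiplication by $(\Delta q)^{-1}$, which is an assumption about the intended division; the helper names and the sample point are likewise illustrative. Along the real direction the quotient approaches $2q$, while along $i$ it approaches $q + i\,q\,i^{-1}$, whose $j$ and $k$ components vanish.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def qinv(q):
    """Inverse of a nonzero quaternion: conjugate divided by squared norm."""
    n2 = sum(c * c for c in q)
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)


def square(q):
    return qmul(q, q)


def quotient(F, q, dq):
    """(F(q + dq) - F(q)) multiplied on the right by dq^{-1} (assumed convention)."""
    shifted = tuple(s + t for s, t in zip(q, dq))
    diff = tuple(s - t for s, t in zip(F(shifted), F(q)))
    return qmul(diff, qinv(dq))


q = (1.0, 2.0, 3.0, 4.0)
eps = 1e-6
print(quotient(square, q, (eps, 0.0, 0.0, 0.0)))  # close to 2q
print(quotient(square, q, (0.0, eps, 0.0, 0.0)))  # a different quaternion
```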

To avoid the above difficulties, a weaker condition than path independence of 




the differential ratio must be adopted. For a continuous function of the complex 
variable z = x + iy, the assertion 


$$\oint_C f(z)\,dz = 0$$


for every closed contour, C, in a domain of the z-plane is equivalent to the regularity 
of f in that domain (Morera’s theorem). An alternate approach which suggests 
itself is the following. 

A function $F$ of the quaternion variable $q$ is said to be left regular in $D$ if

$$\int_{\partial\sigma} dQ\,F = 0 \tag{3}$$

for every closed hypersurface, $\partial\sigma$, in $D$.

A right regular function is defined in similar manner by requiring the vanishing
of the integral $\int_{\partial\sigma} F(q)\,dQ$ under the same circumstances. The following properties
of regular functions are easily established.


LEMMA 3.1. If $F(q)$ is right (left) regular in $D$ and $q_0$ is a constant quaternion,
then $F(q - q_0)$ is also right (left) regular in $D$.


LEMMA 3.2. If $F$ is right regular in $D$ and $G$ is left regular in $D$ then
$\int_{\partial\sigma} F\,dQ\,G = 0$ for any closed hypersurface, $\partial\sigma$, in $D$.


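THEOREM 3.1. The function $F = \phi + \psi$ is left regular in $D$ if and only if

$$\frac{\partial\phi}{\partial w} = \nabla \cdot \psi, \tag{4}$$

$$\frac{\partial\psi}{\partial w} = -\nabla\phi - \nabla \times \psi. \tag{5}$$

Proof. This result follows directly from (1) and (2) since (4) and (5) are equivalent
to the single quaternion equation $\Box F = 0$.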

The equations satisfied by the components of a right regular function are identical
to (4) and (5) with the sign preceding the cross product in (5) changed to plus, and are
equivalent to the quaternion equation $F\Box = 0$. If a function is simultaneously left
and right regular or, briefly, regular, then $\nabla \times \psi = 0$ and $\psi$ is the gradient of a scalar
potential function, $\psi = \nabla\Phi$. In this case, (4) and (5) are replaced by

$$\frac{\partial\phi}{\partial w} = \Delta\Phi, \qquad \nabla\left(\phi + \frac{\partial\Phi}{\partial w}\right) = 0,$$

where $\Delta$ denotes the three dimensional Laplacian operator in $x$, $y$ and $z$. These last
two equations have some application in the study of fluid flow [3].


COROLLARY 3.1.1. Each component of a left or right regular function satisfies 
Laplace’s equation in the four variables w, x, y and z. 




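Proof. Taking the divergence of both sides of (5) we obtain

$$\Delta\phi = -\nabla \cdot \frac{\partial\psi}{\partial w}.$$

From (4)

$$\Delta\phi = -\frac{\partial}{\partial w}\left(\frac{\partial\phi}{\partial w}\right) = -\frac{\partial^2\phi}{\partial w^2}.$$

Thus,

$$\Delta_4\phi = \frac{\partial^2\phi}{\partial w^2} + \Delta\phi = 0,$$

as required for the scalar part of $F$. From (5) we derive

$$-\frac{\partial^2\psi}{\partial w^2} = \frac{\partial}{\partial w}\nabla \times \psi + \frac{\partial}{\partial w}\nabla\phi = -\nabla \times (\nabla\phi + \nabla \times \psi) + \nabla(\nabla \cdot \psi) = \Delta\psi,$$

so that

$$\Delta_4\psi = 0.$$

As might be expected, given a scalar function $\phi$ of sufficient differentiability, a vector
function $\psi$ can be found so that $\phi + \psi$ constitutes a regular function of $q$, [1].
Due to the well-known maximum principle for solutions of Laplace’s equation we
have the following analogue of Liouville’s theorem.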


COROLLARY 3.1.2. The only quaternion function regular with bounded norm 
in all E* is a constant. 


The concept of regularity may be extended to include functions regular in $\bar{q}$.


DEFINITION. A function $F = \phi + \psi$ is said to be left regular in $\bar{q}$ for a domain
$D$ provided

$$\int_{\partial\sigma} d\bar{Q}\,F = 0$$

for every closed hypersurface, $\partial\sigma$, in $D$.


Right regularity in $\bar{q}$ is defined in the obvious manner through the vanishing of
$\int_{\partial\sigma} F\,d\bar{Q}$ in $D$. Necessary and sufficient conditions for $F$ to be left (right) regular
in $\bar{q}$ are $\bar{\Box}F = 0$ ($F\bar{\Box} = 0$) in $D$ where

$$\bar{\Box} = \frac{\partial}{\partial w} - \nabla.$$

Regular functions of $\bar{q}$ also satisfy Corollaries 3.1.1 and 3.1.2. A function, $F$, is left
(right) regular in $q$ only if its conjugate, $\bar{F}$, is right (left) regular in $\bar{q}$. Further, the
only functions simultaneously regular in both $q$ and $\bar{q}$ are constants.


4. Generation of regular functions. Under the foregoing definitions, one hopes
that a norm convergent quaternion power series of the form

$$\sum_{n=0}^{\infty} a_n(q - q_0)^n,$$

where the $a_n$ are constant quaternions, would be a regular function of $q$. Thus, for
every regular function of the complex variable $z$ one could generate an analogous
regular function of $q$ by formally replacing $z$ by $q$ in the power series expansion.
It is the perversity of the quaternion calculus that even simple powers of $q$ are not
regular functions. For example, the scalar part of $q^2$ is $w^2 - \mathbf{r} \cdot \mathbf{r}$ which does not
satisfy Laplace’s equation and hence cannot be regular in $q$. Nevertheless, there is
a close connection between convergent quaternion power series and regular functions.
We shall term quaternion functions defined by norm convergent power series to be
analytic functions and shall restrict ourselves to power series with real coefficients.
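
The failure of Laplace's equation for the scalar part of $q^2$ can be confirmed with a central-difference Laplacian, which is exact for quadratics up to rounding. The Python sketch below is illustrative; `laplacian4`, the step size `h`, and the sample point are arbitrary choices. The four-dimensional Laplacian of $w^2 - x^2 - y^2 - z^2$ is $2 - 6 = -4$, whereas a vector component such as $2wx$ is harmonic.

```python
def laplacian4(f, p, h=1e-2):
    """Central-difference Laplacian of f(w, x, y, z) at the point p."""
    total = 0.0
    for n in range(4):
        up, dn = list(p), list(p)
        up[n] += h
        dn[n] -= h
        total += (f(*up) - 2.0 * f(*p) + f(*dn)) / h ** 2
    return total


def scalar_part_q2(w, x, y, z):
    """Scalar part of q^2, namely w^2 - r.r."""
    return w * w - (x * x + y * y + z * z)


def i_component_q2(w, x, y, z):
    """Coefficient of i in q^2, namely 2wx."""
    return 2.0 * w * x


point = (0.3, -1.2, 0.7, 2.0)
print(laplacian4(scalar_part_q2, point), laplacian4(i_component_q2, point))
```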
The formal device of replacing $z$ by $q$ in a series expansion can be carried out in
a more systematic manner. Let $f(z) = u(x, y) + iv(x, y)$ be a regular function of the
complex variable $x + iy$ in some domain. We generate a quaternion function $F$
from $f$ by replacing $x$ with $w$, $y$ with $r = (x^2 + y^2 + z^2)^{1/2}$ and $i$ with $e_r = \mathbf{r}/r$ so that

$$F(q) = u(w, r) + e_r v(w, r).$$

Since $z^n$ is replaced by $q^n$ this method yields the same result as the power series
substitution. We inquire as to whether or not the function $F$ thus generated is regular
in $q$. Instead of attempting to verify (4) and (5), we shall check the necessary
conditions $\Delta_4(u + e_r v) = 0$. We find that


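$$\frac{\partial^2}{\partial w^2}(u + e_r v) = \frac{\partial^2 u}{\partial w^2} + e_r\frac{\partial^2 v}{\partial w^2},$$

$$\Delta(u + e_r v) = \frac{\partial^2 u}{\partial r^2} + \frac{2}{r}\frac{\partial u}{\partial r} + e_r\left(\frac{\partial^2 v}{\partial r^2} + \frac{2}{r}\frac{\partial v}{\partial r} - \frac{2v}{r^2}\right).$$

Since

$$\frac{\partial^2 u}{\partial w^2} + \frac{\partial^2 u}{\partial r^2} = 0 = \frac{\partial^2 v}{\partial w^2} + \frac{\partial^2 v}{\partial r^2},$$

then

$$\Delta_4(u + e_r v) = \frac{2}{r}\frac{\partial u}{\partial r} + 2\left(\frac{1}{r}\frac{\partial v}{\partial r} - \frac{v}{r^2}\right)e_r. \tag{6}$$

The only functions generated in this manner whose components satisfy Laplace’s
equation are constants or linear functions of $q$. Using $\partial u/\partial r = -\partial v/\partial w$, we may
rewrite (6) as

$$\Delta_4(u + e_r v) = -2\left[\frac{\partial}{\partial w}\left(\frac{v}{r}\right) - e_r\frac{\partial}{\partial r}\left(\frac{v}{r}\right)\right].$$

Since the spatial variables $x$, $y$, $z$ only occur in the combination $r$, this result appears
to be a special case of the more general equation

$$\Delta_4(u + e_r v) = 2\left(-\frac{\partial}{\partial w} + \nabla\right)\left(\frac{v}{r}\right). \tag{7}$$

Since

$$\Delta_4(u + e_r v) = \bar{\Box}\Box(u + e_r v) = -2\bar{\Box}\left(\frac{v}{r}\right), \qquad \bar{\Box} = \frac{\partial}{\partial w} - \nabla,$$

we deduce that

$$\Box F = -\frac{2v}{r}, \tag{8}$$

as may be readily verified. The equation (8) holds only for functions $F$ constructed
in the above manner. If $F$ is generated from a function regular in the complex variable
$\bar{z} = x - iy$, the corresponding result obtained is

$$\bar{\Box}F = \frac{2v}{r}. \tag{9}$$

Equations (8) and (9) yield the relation

$$\bar{\Box}F = 2\left(\frac{\partial F}{\partial w} + \frac{v}{r}\right), \tag{10}$$

which may be applied if $F$ is generated from a regular function of $z$.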
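
As a worked check of (8), offered as an illustration rather than part of the original argument, take $f(z) = z^2$, so that $F = q^2$ with $u = w^2 - r^2$ and $v = 2wr$. Writing $q^2 = (w^2 - \mathbf{r} \cdot \mathbf{r}) + 2w\mathbf{r}$ and applying (2),

$$\Box q^2 = \frac{\partial}{\partial w}(w^2 - r^2) - \nabla \cdot (2w\mathbf{r}) + \frac{\partial}{\partial w}(2w\mathbf{r}) + \nabla(w^2 - r^2) + \nabla \times (2w\mathbf{r}) = 2w - 6w + 2\mathbf{r} - 2\mathbf{r} + 0 = -4w,$$

which agrees with $-2v/r = -2(2wr)/r = -4w$.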

The symmetry of the generating process shows that the generated function must 
be regular (both left and right) if it is either left or right regular and, therefore, 
must satisfy $\Box F = 0$. Equation (8) shows that functions generated from regular
functions are not regular; however,

$$\Delta_4\left(\frac{v}{r}\right) = \frac{1}{r}\left(\frac{\partial^2 v}{\partial w^2} + \frac{\partial^2 v}{\partial r^2}\right) = 0.$$

We have proved the following:


THEOREM 4.1. If $F$ is generated from a regular function $f$ of $z$ then the function
$\Delta_4 F$ is a regular function of $q$.


COROLLARY 4.1.1. The norm convergent series $\sum a_n \Delta_4 q^n$ is regular in $q$.


COROLLARY 4.1.2. Each component of a function $F$ generated as above satisfies
the biharmonic equation $\Delta_4\Delta_4 F = 0$.


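COROLLARY 4.1.3. Let $v$ be harmonic in $w$ and $r$. Then the quaternion function
$\bar{\Box}(v/r)$ is regular in $q$.

THEOREM 4.2. Let $F$ be generated from the function $f$ regular in $z$; then

$$\Delta_4 F = \frac{2}{r}e_r\left(\frac{\partial F}{\partial w} - \frac{v(w, r)}{r}\right). \tag{11}$$

Proof. Applying the operator $\bar{\Box}$ to both sides of (8), we find

$$\bar{\Box}\Box F = \Delta_4 F = -2\bar{\Box}\left(\frac{v}{r}\right) = -\frac{2}{r}\left(\bar{\Box}v + \frac{v}{r}e_r\right). \tag{12}$$

Since $v$ may be written as $v(w, r) = \tfrac{1}{2}(e_r\bar{F} - e_r F)$, we have

$$\bar{\Box}v = \tfrac{1}{2}\bar{\Box}(e_r\bar{F}) - \tfrac{1}{2}\bar{\Box}(e_r F). \tag{13}$$

The function $e_r F$ is generated from the function $if$ which is regular in $z$ while $e_r\bar{F}$ is
generated from $i\bar{f}$ which is regular in $\bar{z}$, so by (9) and (10), equation (13) becomes

$$\bar{\Box}v = \frac{1}{2}\left(\frac{2u}{r}\right) - \frac{1}{2} \cdot 2\left(e_r\frac{\partial F}{\partial w} + \frac{u}{r}\right) = -e_r\frac{\partial F}{\partial w}.$$

Equation (11) now follows from (12). Fueter’s two formulas for $\Delta_4 q^n$ and $\Delta_4 q^{-n}$
[1, p. 316] are special cases of (11).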
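
Formula (11) can be spot-checked numerically for $F = q^3$, for which $\partial F/\partial w = 3q^2$. The Python sketch below compares a central-difference value of $\Delta_4 q^3$ (exact for cubics up to rounding) with $\frac{2}{r}e_r(\partial F/\partial w - v/r)$; the helper names, step size, and sample point are assumptions made for illustration. For this $F$ both sides equal $-12w - 4\mathbf{r}$.

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def cube(q):
    return qmul(q, qmul(q, q))


def three_q_squared(q):
    """Partial derivative of q^3 with respect to the scalar part w."""
    return tuple(3.0 * c for c in qmul(q, q))


def laplacian4(F, q, h=1e-2):
    """Component-wise central-difference Laplacian in w, x, y, z."""
    out = [0.0] * 4
    F0 = F(q)
    for n in range(4):
        up, dn = list(q), list(q)
        up[n] += h
        dn[n] -= h
        Fu, Fd = F(tuple(up)), F(tuple(dn))
        for c in range(4):
            out[c] += (Fu[c] - 2.0 * F0[c] + Fd[c]) / h ** 2
    return tuple(out)


def rhs_of_11(F, dF_dw, q):
    """(2/r) e_r (dF/dw - v/r), where F = u + e_r v."""
    w, x, y, z = q
    r = math.sqrt(x * x + y * y + z * z)
    e_r = (0.0, x / r, y / r, z / r)
    Fq = F(q)
    v = (Fq[1] * x + Fq[2] * y + Fq[3] * z) / r  # component of F along e_r
    Fw = dF_dw(q)
    bracket = (Fw[0] - v / r, Fw[1], Fw[2], Fw[3])
    return tuple(2.0 / r * c for c in qmul(e_r, bracket))


q = (0.5, 1.0, -2.0, 0.7)
print(laplacian4(cube, q))
print(rhs_of_11(cube, three_q_squared, q))
```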


5. The Cauchy-Fueter integral formula. Cauchy’s integral formula expresses the
value of a regular function at a point interior to a closed contour in terms of the 
integral of its values on the contour. An analogous but more complicated theorem 
holds for regular functions of q. We shall need the following fundamental theorem. 


THEOREM 5.1. Let $\partial\sigma$ be a closed hypersurface in $E^4$ containing the point $q_0$,
then 


0 n=0,1,-- 
(14) | A,(q — qo)"dQ = < 8n* n=—-1 
da 
0 n= —2, —3,- 


Proof. For $n$ a non-negative integer $\Delta_4(q - q_0)^n$ is regular in $E^4$ and the result
follows directly from the definition of regularity. If $n$ is a negative integer the desired
results can all be obtained from the case $n = -1$ by differentiation under the integral
sign with respect to the scalar part of $q_0$. In fact, if


(15) | Aa(q — qo) d@ = 87’, 
0a 


then 




3° ~1 o" —1 
Owe [asa a0 dQ = \. Aa Bye (I~ Fo) dQ 


= (k!) | A4(q — qo) Ode = Q, 
0a 


with $q_0 = w_0 + \mathbf{r}_0$ and $k = 1, 2, \ldots$.
All that remains is to prove (15). In view of Lemma 3.1 we need only to establish
the case where $q_0 = 0$ and $\partial\sigma$ is a hypersurface enclosing the point $q_0 = 0$. Since
$\Delta_4 q^{-1}$ is regular except at $q = 0$,
| A,q ‘dQ = -| Aq —17Q. 
da \q| =1 
Equation (7) can be used to find $\Delta_4 q^{-1}$. Because $v(w, r) = -r/\rho^2$ where $\rho^2 = w^2 + r^2$,
then $\Delta_4 q^{-1} = -(4/\rho^2)q^{-1}$. The scalar surface element of a sphere having radius
$|q|$ in $E^4$ is $|q|^3 dS$, where $dS$ is the surface element of the corresponding unit sphere
in $E^4$, [10, p. 677]. The oriented surface element for a sphere of radius $|q|$ is therefore

$$dQ = |q|^2 q\,dS. \tag{16}$$
The integral in question becomes 
-| Asa 'd@ = 4[ q ‘adS = 8n’, 
la] =1 v la|=1 


since the surface area of the unit sphere in $E^4$ is $2\pi^2$, [10, p. 677].

The previous theorem leads one to expect that the functions $\Delta_4 q^n$ will play
roughly the same role in the quaternion calculus that the functions $z^n$ play in ordinary
complex analysis. Given a function $F$ defined by a Laurent type series


F(q)= 2 a,(q— qo)” 


we deduce formally 
AsF(q)(q- 40) = 2 aAa(q — 40)" 


from which we derive 


1 _ 
a,-1 = Fo) = S72 { A,F(q)(q — qo) “dQ. 


1? 
Of more interest is the following analogue of the Cauchy integral formula. 


THEOREM 5.2. Let $F$ be a regular function of $q$ in $D$. If $\partial\sigma$ is a hypersurface
in $D$ containing the point $q_0$, then




F(qo) = 3 |, F(q)dQA,(q —qo)*. 


Proof. For $\varepsilon$ small enough, the hypersurface centered at $q_0$ defined by
$|q - q_0| = \varepsilon$ lies inside $\partial\sigma$. In the region between the surface of the $\varepsilon$-sphere and
$\partial\sigma$ both $F$ and $\Delta_4(q - q_0)^{-1}$ are regular so that, using Lemma 3.2 we can show that


1 _ 1 _ 
=a | F(q)dQA,(q— 0) * = 5-5 F(q)dQA,(q — qo)’. 
87 da 87 la—qo| =e 
The surface element for the last integral is found by replacing $q$ with $q - q_0$ in (16);
thus,

$$dQ = |q - q_0|^2(q - q_0)\,dS.$$

The function $F(q)$ is to have sufficient differentiability so that

$$F(q) = F(q_0) + O(|q - q_0|), \qquad |q - q_0| \to 0.$$


The limit of the last integral as e > 0 is therefore found to be 


_ 4 
lim 55 F(q)e"(q — qo)e 7(q — qo)” *dS 
8x la—qol =1 


= lims5 | (F(qo) + O(e))dS = F(qo) 
la-qo| =1 


as required. 

It is essential in Theorem 5.2 that the terms in the integral be separated by the
differential $dQ$ since $F \cdot \Delta_4(q - q_0)^{-1}$ is not generally regular even if $F$ is. Many
properties of regular functions such as the existence of series expansions, mean 
value theorems, etc., can be proved from (17) in much the same manner as is done in 
complex analysis. We give only one such example, the familiar Poisson integral 
formula for n = 4, [10, p. 265]. 


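COROLLARY 5.2.1. Let $F = \phi + \psi$ be regular in $q$ for $|q| < \rho$ and let $q_0 = w_0 + \mathbf{r}_0$
be a point such that $|q_0| = R$, where $\rho > R$, then

$$\phi(q_0) = \frac{\rho^2(\rho^2 - R^2)}{2\pi^2}\int_{|q| = \rho} \frac{\phi(q)\,dS}{(\rho^2 + R^2 - 2\rho R\cos\theta)^2}, \tag{18}$$

where $\cos\theta$ is defined by

$$\cos\theta = (ww_0 + \mathbf{r} \cdot \mathbf{r}_0)/|q||q_0|. \tag{19}$$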


Proof. From (19) 


1q|{4o| cos(@) = wwo +r° ro 


1a? + [do|” — |a — ao]? 




so that (18) becomes 


26 n2 __ R2 ds 
(20) (Go) = sie aes _ 
n la-qo| =1 14 — 4o| 
From Theorem 5.2, 
1 _ 
(21) F(qo) = 35 F(q)d@A,(q — qo) 
la-qo| =p 
_ it F(q)p*qdS(q — 40) ' 
2n* la-qo| =1 \q = 4o|’ 
If qg = p7qo° then |q9| > p?/R > p and 
1 _ 
(22) O= 35 | F(q)dQ(q — qo) 
m1 la-qo| =p 
_ it () p°q4Sq~"(4o — 9) Fo 
2m? J \q-aol =1 \P 14 — Go|” 
Dividing out the constant (R/p)* from (22) and subtracting (21) from (22) we 

find 

1 F(q)p” - =~ = \-15 

Fae) = 5 | ——~—s (4 — 90) * + (4 — Go) “Fo aS 
2x? Jiqeaot=t 14 — Go? ° “” 
_ pe —R? | F(q)dS 
2n? |@-4o| =1 \q 7 qo|* 


which proves the equation (20) and hence the theorem when the scalar parts of the 
last equation are equated. 

6. Applications. Aside from older mechanics texts, which sometimes treat rotations 
in quaternion form, quaternions are seldom encountered except, along with their cousins 
the octernions and Clifford numbers, in the factorization of relativistic energy equations 
[11]. Instead of studying Laplace’s equation in 4 variables, one generally 
wants to consider the wave operator, 

$\Delta - \frac{1}{c^2} \frac{\partial^2}{\partial t^2}.$

Formal replacement of w by ict changes one equation into the other. The equations 
for right regular quaternion functions then become 


(23) 1 P _qy 


(24) Ve = 




The resemblance of (23) to a conservation equation suggests the further substitution 
@ = — id to obtain from (23) and (24) 


(25) -~+V-p=0, 


(26) vi=—-- Viiv xy. 


We expect / to be real and to have real components. Equations (25) and (26) 
describe a variety of physical systems. If we identify 2 = ce and w = c?n where c 
is the speed of light in vacuo, e and nm are the relativistic energy and momentum 
densities, respectively, of the system under consideration, then, (25) and (26) break 
into the three equations 


de 2) 

a+ ¥ (c*n) = 0 

1 On e 

c Ot (;) ° 
Vxa« = 0. 


These three equations, the first of which is the conservation of mass-energy, constitute 
the basis of relativistic mechanics in the absence of electromagnetic forces [11, p. 272]. 


Defining the relativistic momentum quaternion P = — ie + cnx and the operator 
—i oO 
*=——+V 
L c Ot + 


we have the following quaternion expression for these equations: 
[\*P = 0. 
Thus, we have proved the following result. 


THEOREM 6.1. In the absence of electromagnetic forces, the momentum quaternion, 
P, is a (formal) regular function of the quaternion variable ict + r. 


Maxwell’s equations 


V-H=0, V-E=o0 
c Ot 


are likewise expressed in the simple quaternion form 




(27) O*(E + iH) = —p+ + J. 


If $\phi$ and A are the usual Hertzian scalar and vector potentials for E and H [12, p. 
212], the electromagnetic field is derivable from the quaternion potential function 
$i\phi + A$ through the equation 


(28) $\Box^*(i\phi + A) = i(E + iH).$
From (27) and (28) we obtain 


1 
c? ot? 


(29) C*O*(id + A) = ( _— + A\(id + A). 


In component form (29) yields the equations relating the electromagnetic potential 
with the charge and current densities 


1 
[1*¢ = Pp, 7A = ad: 


where 1 2 


2 _ A LT 
= c2 Ot? 

Acknowledgments. The author wishes to thank John Castelluci for translating portions of the 

original German, and the referee for his many valuable suggestions and improvements in the paper. 


References 


1. R. Fueter, Die Funktionentheorie der Differentialgleichungen Δu = 0 und ΔΔu = 0 mit 
vier reellen Variablen, Comment. Math. Helv., 7 (1935) 307-30. 
2. H. Haefeli, Hyperkomplexe Differentiale, Comment. Math. Helv., 20 (1947) 382-420. 
3. A. Rose, On the use of a complex (quaternion) velocity potential in three dimensions, 
Comment. Math. Helv., 24 (1950) 135-48. 
4. R. Fueter, Über vierfachperiodische Funktionen, Monatsh. Math. Phys., 48 (1939) 161-69. 
5. E. Bell, Development of Mathematics, McGraw-Hill, New York, 1945, Chapter IX. 
6. M. Crowe, A History of Vector Analysis, Notre Dame Press, 1967. 
7. C. Curtis, MAA Studies in Modern Algebra, Vol. II, Math. Assoc. of America, 1963, 
pp. 108-11. 
8. P. Duval, Homographies, Quaternions, and Rotations, Oxford Math. Monographs, Oxford 
Press, 1964. 
9. W. Hamilton, Elements of Quaternions, Vol. I, Chapter II, 1860, reprinted by Chelsea, New 
York. 
10. R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. II, Interscience, Wiley, 
New York, 1965. 
11. A. Kyrala, Theoretical Physics, Saunders, Philadelphia, 1967. 
12. H. Phillips, Vector Analysis, Wiley, New York, 1963. 


INEQUALITIES FOR SUMS OF DISTANCES 


G. D. CHAKERIAN, University of California at Davis and 
M. S. KLAMKIN, Ford Motor Company 


1. Introduction. Any triangle inscribed in a circle of radius R and having the 
center as an interior point has perimeter greater than 4R. This theorem is discussed 
in [1], where the reader can find a number of references to papers dealing with 
generalizations and applications of the result. For example, it follows readily from 
this that any closed plane curve of length L can be covered by a circular disk of 
radius L/4. Assuming our circle is the unit circle centered at the origin and $V_1, V_2, V_3$ 
are the unit vectors representing the vertices of the triangle, the theorem becomes 


(1) $|V_1 - V_2| + |V_2 - V_3| + |V_1 - V_3| > 4.$


Thus, we see that inequality (1) is satisfied by any three unit vectors whose convex 
hull has the origin as an interior point. 

Our main purpose is to consider extensions of (1) to the case of more than three 
vectors and to higher dimensional spaces. Indeed, in Section 2, we shall give a simple 
proof of the following theorem for r unit vectors in Euclidean n-space $E^n$. 


THEOREM 1. If $V_1, V_2, \dots, V_r$ are unit vectors in $E^n$ and the origin is interior to 
their convex hull, then 


$\sum |V_i - V_j| > 2(r - 1),$
where the sum is over $1 \le i < j \le r$. 


The following corollary, which is a special case of Conjecture 4 raised in [1], is an 
immediate consequence. 


COROLLARY 1. The total edge length of a simplex inscribed in a unit sphere 
in $E^n$ with the center in its interior is greater than 2n. 


We shall see that Theorem 1 follows from an even stronger inequality, namely, 
(2) $\sum |V_i - V_j|^2 > 4(r - 1).$


In Section 3, we give a second proof of Theorem 1. Although not as simple as 
the proof in Section 2, the idea of this proof leads to other interesting results and 
new proofs of some theorems discussed in later sections. Finally, we discuss some 
questions about upper bounds in Section 7. 


2. Proof of Theorem 1. The proof depends on two known results. First we have 
the following easily established identity for r unit vectors in $E^n$: 


(3) $\sum |V_i - V_j|^2 = r^2 - |\sum V_i|^2,$
where as usual the sum on the left hand side is over $1 \le i < j \le r$ while that on the 
right hand side is for $1 \le i \le r$. This identity, in fact, gives a solution to a problem 
on the 1968 William Lowell Putnam Mathematical Competition (see [2] where a 
derivation of (3) is given). Next, we use an inequality proved in [3], that is, with the 
conditions given in Theorem 1, 


(4) $|\sum V_i| < r - 2.$


For the sake of completeness and because of the relative inaccessibility of reference 
[3], we now give a proof of (4). 
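
Identity (3) can be spot-checked numerically before it is used. The following is a minimal sketch, assuming random unit vectors obtained by normalizing Gaussian samples; the helper names (`random_unit_vector`, `sum_squared_distances`, `identity_rhs`) are illustrative and not from the paper.

```python
import math
import random

def random_unit_vector(n, rng):
    """Return a random unit vector in E^n (a normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sum_squared_distances(vectors):
    """Left side of (3): sum of |V_i - V_j|^2 over all pairs i < j."""
    total = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            total += sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
    return total

def identity_rhs(vectors):
    """Right side of (3): r^2 - |V_1 + ... + V_r|^2."""
    r = len(vectors)
    vector_sum = [sum(column) for column in zip(*vectors)]
    return r * r - sum(x * x for x in vector_sum)

rng = random.Random(0)
vectors = [random_unit_vector(4, rng) for _ in range(7)]  # r = 7 vectors in E^4
lhs, rhs = sum_squared_distances(vectors), identity_rhs(vectors)
print(lhs, rhs)  # the two values should agree up to rounding error
```

The agreement does not depend on the dimension or on the position of the origin, since (3) holds for any r unit vectors.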

Since the $V_i$'s do not all lie in any half-space, the convex cone which is positively 
spanned by the vectors must be $E^n$. For if it were not, there would then exist some 
hyperplane [4] such that the cone would lie entirely to one side of it. But this would 
violate the hypothesis. Consequently, there exists a set of numbers $a_1, \dots, a_r$, where 
$a_i \ge 0$, $i = 1, \dots, r$, $\sum a_i = 1$, such that 


$a_1 V_1 + \cdots + a_r V_r = 0.$
We now show that $a_i \le 1/2$. Assume to the contrary that $a_1 > 1/2$. Then 
$|a_2 V_2 + \cdots + a_r V_r| = |-a_1 V_1|.$


The r.h.s. is $> 1/2$ and the l.h.s. is $\le a_2 + \cdots + a_r = 1 - a_1 < 1/2$, which is a 
contradiction. Finally, since 


$V_1 + V_2 + \cdots + V_r = (1 - 2a_1)V_1 + \cdots + (1 - 2a_r)V_r,$
we have 
$|V_1 + \cdots + V_r| < (1 - 2a_1) + \cdots + (1 - 2a_r) = r - 2.$


This result is best possible since one can get arbitrarily close to r — 2 by consid- 
ering a sequence of convex polytopes converging to the degenerate case of a segment. 

Inequality (2) is now an immediate consequence of (4) in conjunction with (3). 
Since $|V_i - V_j| \le 2$, we have $|V_i - V_j|^2 \le 2|V_i - V_j|$, so Theorem 1 follows 
from (2). This completes the proof. 

The inequality sign in Theorem 1 can be replaced with equality if and only if 
one of the vectors is the negative of the remaining r - 1 vectors. This is easily deduced 
once one observes that in case of equality, we must have $|V_i - V_j|^2 = 2|V_i - V_j|$ 
for $1 \le i < j \le r$, so $|V_i - V_j|$ must always be 0 or 2. It should be noted that the 
inequality sign in (2) may be replaced by equality in other cases, e.g., when two 
vectors are antipodal and the remaining r - 2 coincide. 
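
The chain (4), (2), Theorem 1 can be illustrated in the plane. This is a sketch under stated assumptions: the vectors are sampled on the unit circle, and the hypothesis "origin interior to the convex hull" is enforced through the equivalent planar condition that every angular gap between consecutive vectors is less than $\pi$ (a small safety margin is used to stay away from degenerate cases). The function names are hypothetical.

```python
import math
import random

def unit_vectors_with_origin_inside(r, rng):
    """Sample r >= 3 unit vectors in the plane whose convex hull has the
    origin as an interior point (every angular gap is below pi)."""
    while True:
        angles = sorted(rng.uniform(0.0, 2.0 * math.pi) for _ in range(r))
        gaps = [b - a for a, b in zip(angles, angles[1:])]
        gaps.append(2.0 * math.pi - angles[-1] + angles[0])
        if max(gaps) < math.pi - 1e-3:
            return [(math.cos(t), math.sin(t)) for t in angles]

def check_inequalities(vectors):
    """Return the truth values of (4), (2) and Theorem 1 for the given vectors."""
    r = len(vectors)
    dists = [math.dist(vectors[i], vectors[j])
             for i in range(r) for j in range(i + 1, r)]
    norm_of_sum = math.hypot(sum(v[0] for v in vectors),
                             sum(v[1] for v in vectors))
    return (norm_of_sum < r - 2,                      # inequality (4)
            sum(d * d for d in dists) > 4 * (r - 1),  # inequality (2)
            sum(dists) > 2 * (r - 1))                 # Theorem 1

rng = random.Random(1)
outcomes = [check_inequalities(unit_vectors_with_origin_inside(r, rng))
            for r in range(3, 10) for _ in range(100)]
print(all(all(o) for o in outcomes))
```

A random search of this kind is only a sanity check; it cannot replace the proof above, and it does not probe the near-degenerate configurations where the bounds are approached.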


3. Another proof of Theorem 1. This proof involves an application of a formula 
from integral geometry, expressing the length of a curve on the unit sphere in terms 
of the expected number of intersections with a random great circle. To state the 
higher dimensional version of this, let $S^{n-1}$ be the unit sphere in $E^n$, $n \ge 3$, and 
for each $u \in S^{n-1}$ let G(u) be the great (n - 2)-sphere orthogonal to u. In other 
words, G(u) is the intersection of $S^{n-1}$ with the (n - 1)-dimensional subspace 
orthogonal to u. Let du represent the normalized element of (n - 1)-dimensional 
rotation invariant surface measure on $S^{n-1}$, so that 


$\int_{S^{n-1}} du = 1.$


Now suppose C is a finite collection of arcs of great circles on $S^{n-1}$ of total length 
L(C). For each $u \in S^{n-1}$, let N(C,u) be the cardinality of $G(u) \cap C$. Then, 


(5) $L(C) = \pi \int N(C,u)\,du,$


where the integration is over all of $S^{n-1}$. A proof of this formula, in case n = 3, is 
found in [5]. The formula is more generally valid for rectifiable arcs. For a dis- 
cussion of the proof in this case, at least for n = 3, see [6] and the references given 
therein. Because of its intrinsic interest we now sketch a proof of (5), using a method 
different from that used in [5]. 

If $C_1, \dots, C_k$ are arcs of great circles, and $C = C_1 \cup C_2 \cup \cdots \cup C_k$, then 


$N(C,u) = \sum_{i=1}^{k} N(C_i,u),$


except for an exceptional set of directions constituting a set of zero spherical surface 
measure. Since also 


$L(C) = \sum_{i=1}^{k} L(C_i),$


it follows that it suffices to prove (5) in case C is a single arc of a great circle. But in 
this case, the right hand side of (5) is a function $I(\sigma)$ depending only on the length $\sigma$ 
of the arc C and not depending on its position. Moreover, I is an additive function 
in the sense that 


$I(\sigma_1 + \sigma_2) = I(\sigma_1) + I(\sigma_2),$


as can be seen by subdividing the arc C. Since I can be shown to be a continuous 
function of $\sigma$, and since it is known that any additive continuous function $I(\sigma)$ must 
be of the form $I(\sigma) = \lambda\sigma$, for some constant $\lambda$, we have that I is proportional to the 
arclength of C. To calculate the proportionality constant $\lambda$, we compute I for a great 
semicircular arc C: 


$\lambda\pi = I(\pi) = \pi \int N(C,u)\,du = \pi,$


since N(C,u) = 1 in this case, with a set of exceptions of zero measure. Formula (5) 


then follows. 
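
Formula (5) lends itself to a Monte Carlo illustration in the case n = 3. The sketch below assumes arcs shorter than a semicircle, given by their endpoints, so that G(u) meets an arc exactly when the endpoints lie on opposite sides of the plane orthogonal to u; the sample size and the test arcs are arbitrary choices, and the function names are not from the paper.

```python
import math
import random

def random_direction(rng):
    """Uniform random point u on the unit sphere S^2 (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def crosses(u, p, q):
    """True if the great circle G(u) meets the minor arc from p to q, i.e.
    p and q lie strictly on opposite sides of the plane u . x = 0."""
    side_p = sum(a * b for a, b in zip(u, p))
    side_q = sum(a * b for a, b in zip(u, q))
    return side_p * side_q < 0.0

def crofton_length(arcs, samples, rng):
    """Estimate the right side of (5): pi times the average of N(C, u)."""
    hits = 0
    for _ in range(samples):
        u = random_direction(rng)
        hits += sum(crosses(u, p, q) for p, q in arcs)
    return math.pi * hits / samples

# Two quarter circles of length pi/2 each, so L(C) = pi.
arcs = [((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
        ((0.0, 1.0, 0.0), (0.0, 0.0, 1.0))]
estimate = crofton_length(arcs, 20000, random.Random(2))
print(estimate, math.pi)  # the estimate should be close to pi
```

The estimate fluctuates with the random seed; only approximate agreement with the true length should be expected.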
Coming back to the proof of Theorem 1, let $P, Q \in S^{n-1}$, and let $\sigma$ be the length 
of the shortest great circle arc joining them and $d = |P - Q|$. We have $0 \le \sigma \le \pi$, 
hence 


$\frac{d}{2} = \sin\left(\frac{\sigma}{2}\right) \ge \frac{2}{\pi}\left(\frac{\sigma}{2}\right),$


by Jordan’s inequality [7, p. 33]; in other words, 


(6) $d \ge \frac{2}{\pi}\sigma.$
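
Inequality (6) compares the chord $d = 2\sin(\sigma/2)$ with the arc $\sigma$. As a quick illustration (a sketch on an assumed grid of 1001 sample values of $\sigma$ in $[0, \pi]$, not a proof), the difference between the two sides never drops below zero beyond rounding error and vanishes at the endpoints:

```python
import math

def chord(sigma):
    """Euclidean distance between two points of the unit sphere joined by a
    great circle arc of length sigma."""
    return 2.0 * math.sin(sigma / 2.0)

grid = [k * math.pi / 1000 for k in range(1001)]
differences = [chord(s) - (2.0 / math.pi) * s for s in grid]
worst = min(differences)
print(worst)  # expected to be zero up to rounding error
```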

Now suppose $V_1, \dots, V_r \in S^{n-1}$. For each $i < j$, set $d_{ij} = |V_i - V_j|$ and let $\sigma_{ij}$ be the 
length of a shortest great circle arc joining $V_i$ to $V_j$. Assuming the origin is interior 
to the convex hull of $V_1, \dots, V_r$, each hyperplane through the origin not containing 
any $V_i$ must separate some k of the points, $1 \le k \le r - 1$, from the remaining r - k. 
Hence such a hyperplane crosses $k(r - k) \ge r - 1$ of the line segments determined 
by $V_1, \dots, V_r$. Let C be the collection of great circle arcs on $S^{n-1}$ obtained by projecting 
the set of line segments with endpoints among $V_1, \dots, V_r$ radially into the sphere. 
(In case a diameter is encountered, simply associate with it any semicircle with 
the same endpoints.) With N(C, u) defined as above, the preceding argument shows 
that 


(7) $N(C,u) \ge r - 1,$


because G(u) intersects C in the same number of points that the hyperplane through 
the origin orthogonal to u intersects the set of line segments determined by $V_1, \dots, V_r$. 
This is true at least for those u such that G(u) contains no $V_i$, a set of measure 1. 
From (6) we have 


(8) $\sum |V_i - V_j| = \sum d_{ij} \ge \frac{2}{\pi} \sum \sigma_{ij},$
and from (5) and (7), 
(9) $\sum \sigma_{ij} = \pi \int N(C,u)\,du \ge \pi(r - 1).$


Theorem 1 now follows from (8) and (9). We must actually have strict inequality in (8) 
under the given hypothesis since not all $\sigma_{ij}$ equal $\pi$. 

Note that (8) is applicable also in case n = 2. If r = 3, we have in this case 
$\sum \sigma_{ij} = 2\pi$, obtaining still another proof of (1). 


4. The smallest ball containing a space curve. The methods used in section 3 
provide another way to prove that any closed curve of length L in $E^n$ is contained in 
a ball of radius L/4. For other proofs of this, and related problems, see [8, 9] and 
references given therein. 

It is not difficult to see that it suffices to prove the following theorem. 


THEOREM 2. Let K be a closed curve of length L(K) contained in the unit ball 
in $E^n$. If K intersects $S^{n-1}$, the boundary of the ball, in a set whose convex hull 
contains the origin, then $L(K) \ge 4$. 


Proof. There exist a finite number r of points in $K \cap S^{n-1}$ whose convex hull 
contains the origin. We may label these points $V_1, \dots, V_r$ in such a way that they 
lie cyclically on K, so that 


(10) $L(K) \ge \sum |V_i - V_{i+1}|,$


where the sum is over $1 \le i \le r$, and by convention $V_{r+1} = V_1$. Let C be the closed 
curve on $S^{n-1}$ obtained by projecting the polygon P with successive vertices 
$V_1, \dots, V_r$ radially into the sphere (again with the convention that if $V_i V_{i+1}$ is a 
diameter, then we associate with it a semicircle with the same endpoints). Every 
hyperplane through the origin intersects the polygon P in at least two points; hence 
every great (n - 2)-sphere G(u) intersects C in at least two points. Thus, with the 
notation of Section 3, we have $N(C,u) \ge 2$, so from (5), 


(11) $L(C) \ge 2\pi.$


If we now apply (6), with $d_i = |V_i - V_{i+1}|$ and $\sigma_i$ equal to the length of the radial 
projection of the segment $V_i V_{i+1}$ into the sphere, we obtain 


(12) $L(K) \ge \sum d_i \ge \frac{2}{\pi} \sum \sigma_i = \frac{2}{\pi} L(C) \ge 4.$


5. Generalization to points on a convex curve. A generalization of (1) in another 
direction is given in [1]. Let K be a plane convex curve containing the origin in its 
interior. Let $V_1, V_2, V_3$ be points on K whose convex hull has the origin as an interior 
point. Then it is shown in [1] that 


(13) $|V_1 - V_2| + |V_2 - V_3| + |V_1 - V_3| > 2m,$


where m is the length of the minimum chord of K passing through the origin. A simple 
inductive argument will give the following extension of (13). 


THEOREM 3. Let K be as above and let $V_1, \dots, V_r$ be points on K whose convex 
hull contains the origin. Then if m is the length of the minimum chord of K containing the 
origin, we have 

$\sum |V_i - V_j| \ge m(r - 1).$


Proof. The case r = 3 is proved in [1]. Assume the result known for $r \le k$, 
and suppose $V_1, \dots, V_{k+1}$ are k + 1 points on K with the origin in the interior of 
their convex hull. The convex hull of some k of the points, say $V_1, \dots, V_k$, contains 
the origin, so 
(14) $\sum_{1 \le i < j \le k} |V_i - V_j| \ge m(k - 1).$


Now some triangle with one vertex at $V_{k+1}$ and its other two vertices among 
$V_1, \dots, V_k$ contains the origin. On the basis of the case r = 3 of our inequality, it is 
easy to show that the sum of the lengths of each pair of sides of this triangle is at 
least m. Thus $|V_{k+1} - V_\alpha| + |V_{k+1} - V_\beta| \ge m$ for some $\alpha, \beta \in \{1, \dots, k\}$, $\alpha \ne \beta$. 
In conjunction with (14), this implies $\sum |V_i - V_j| \ge mk$, where the sum is over 
$1 \le i < j \le k + 1$. By induction, this completes the proof. 


REMARK. A close examination of the proof, keeping (13) in mind, shows that 
equality can hold in Theorem 3 only in the degenerate case when $V_1, \dots, V_r$ lie on a 
minimum chord and one of the points is the negative of the other r - 1 points. 
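
Theorem 3 can be tried out on a concrete curve. The sketch below assumes that K is the ellipse $x^2/4 + y^2 = 1$ centered at the origin, so that the minimum chord through the origin is the minor axis and m = 2; the points are chosen by direction, with every angular gap below $\pi$ so that their convex hull contains the origin. The curve, the sampling scheme and the names are illustrative choices only.

```python
import math
import random
from itertools import combinations

A, B = 2.0, 1.0   # assumed semi-axes of the example ellipse K
M = 2.0 * B       # length of the minimum chord of K through the origin

def point_on_ellipse(theta):
    """Point of K on the ray from the origin in direction theta."""
    rho = 1.0 / math.sqrt((math.cos(theta) / A) ** 2 + (math.sin(theta) / B) ** 2)
    return (rho * math.cos(theta), rho * math.sin(theta))

def directions_around_origin(r, rng):
    """Sample r >= 3 directions with every angular gap below pi, so that the
    convex hull of the corresponding points of K contains the origin."""
    while True:
        angles = sorted(rng.uniform(0.0, 2.0 * math.pi) for _ in range(r))
        gaps = [b - a for a, b in zip(angles, angles[1:])]
        gaps.append(2.0 * math.pi - angles[-1] + angles[0])
        if max(gaps) < math.pi:
            return angles

def sum_of_distances(points):
    """Sum of |V_i - V_j| over all pairs i < j."""
    return sum(math.dist(p, q) for p, q in combinations(points, 2))

rng = random.Random(3)
results = []
for r in range(3, 9):
    for _ in range(50):
        pts = [point_on_ellipse(t) for t in directions_around_origin(r, rng)]
        results.append(sum_of_distances(pts) >= M * (r - 1))
print(all(results))
```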


6. Extensions to Minkowski planes. Let K be a centrally symmetric convex 
curve centered at the origin in $E^2$. For any $X \in E^2$ define the K-norm of X by 


(15) $|X|_K = |X|/r,$


where |X| is the ordinary Euclidean norm of X and r is the (Euclidean) length of 
the radius of K in the direction of X. Then $|\cdot|_K$ is a norm for a Minkowski plane, 
or a two dimensional Banach space, whose unit circle is K. 
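
Definition (15) can be made concrete with a simple choice of K. In the sketch below K is assumed to be the square with vertices $(\pm 1, \pm 1)$, for which the K-norm reduces to the familiar sup norm $\max(|x|, |y|)$; the function names and the default argument are illustrative.

```python
import math

def radius_square(theta):
    """Euclidean length of the radius of K in direction theta, where K is the
    square with vertices (+-1, +-1) (an assumed example curve)."""
    return 1.0 / max(abs(math.cos(theta)), abs(math.sin(theta)))

def k_norm(x, y, radius=radius_square):
    """Formula (15): |X|_K = |X| / r, with r the radius of K in the direction of X."""
    euclid = math.hypot(x, y)
    if euclid == 0.0:
        return 0.0
    return euclid / radius(math.atan2(y, x))

value = k_norm(3.0, -2.0)
print(value)  # agrees with max(|3|, |-2|) = 3 up to rounding error
```

Passing a different `radius` function (any positive function of the direction with period $\pi$ that describes a convex curve) gives the norm of another Minkowski plane.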

Bearing a close relationship to the matters discussed in this article, there is a 
famous theorem of Fenchel stating that every closed curve has total curvature at 
least $2\pi$ (see [5], where a proof based on (5) is given). Indeed, the main step of the 
proof is embodied in our inequality (11). In the course of obtaining a version of 
Fenchel’s theorem valid for Minkowski spaces, Laugwitz [10] showed that if 
$V_1, V_2, V_3$ are unit vectors in a Minkowski plane (i.e., $|V_i|_K = 1$, i = 1, 2, 3), with 
the origin interior to their convex hull, then 


(16) $|V_1 - V_2|_K + |V_2 - V_3|_K + |V_1 - V_3|_K \ge 4.$


We give another proof of (16) based on an idea from Section 3. This requires 
the following analogue of Jordan’s inequality for Minkowski planes. 


LEMMA. Let K be the unit circle in a Minkowski plane and $P, Q \in K$. Let 
$d = |P - Q|_K$, the Minkowski length of the chord joining P to Q. Let $\sigma$ be the 
length of the smaller arc of K from P to Q, and let $\lambda$ be half the perimeter of K, 
both measured in the given Minkowski metric. Then, 


$d \ge 2\sigma/\lambda.$


Proof. Let P’ and Q’ be the endpoints of the diameter of K parallel to the chord 
joining P to Q (with P’, P, Q, Q’ lying in that order along K). Let R be the point of 
intersection of the lines determined by PP’ and QQ’ respectively, assuming these 
lines are not parallel. Then R lies on the same side of the line through P’Q’ as P 
and Q, and the similarity transformation with R as fixed point and sending P to P’ 
and Q to Q’ also sends the arc $\sigma$ to a convex arc $\sigma'$ with endpoints P’ and Q’. This 
transformation increases lengths by the factor 2/d, so $\sigma' = 2\sigma/d$, where for convenience 
we use the same symbol to denote an arc and its length in the Minkowski 
metric. It is easy to see that the arc $\sigma'$ lies inside the unit disk K. Using the fact that if 
$K_1$ is a closed convex curve inside a closed convex curve $K_2$, then $K_1$ is not longer 
than $K_2$ (proved in a Minkowski plane in exactly the same way as in the Euclidean 
plane), one obtains that $\sigma'$ has length at most half K. That is, $2\sigma/d = \sigma' \le \lambda$, as we 
wanted to prove. If the lines through PP’ and QQ’ are parallel, the proof proceeds 
the same way, with the similarity transformation replaced by a translation. 

To prove (16), let $d_i = |V_i - V_{i+1}|_K$, i = 1, 2, 3, with $V_4 = V_1$, and $\sigma_i$ equal to 
the length of the shorter arc of K from $V_i$ to $V_{i+1}$. By the lemma, we have $d_i \ge 2\sigma_i/\lambda$, 
i = 1, 2, 3. Hence 

$\sum |V_i - V_{i+1}|_K = \sum d_i \ge \frac{2}{\lambda} \sum \sigma_i = 4,$
which proves the inequality (16). 

The same inductive argument used to establish Theorem 3 can be used in conjunction 
with (16) to prove the following generalization. 


THEOREM 4. Let $V_1, V_2, \dots, V_r$ be unit vectors whose convex hull contains the 
origin in a given Minkowski plane. Then 


$\sum |V_i - V_j|_K \ge 2(r - 1).$
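
Theorem 4 can be illustrated in one particular Minkowski plane. The sketch below assumes the sup norm, whose unit circle K is the square $\max(|x|, |y|) = 1$; unit vectors are sampled by direction with every angular gap below $\pi$, so that their convex hull contains the origin. The last lines evaluate a degenerate configuration in which one vector is the negative of the others. All names are illustrative.

```python
import math
import random
from itertools import combinations

def unit_vector_sup_norm(theta):
    """Point of the Minkowski unit circle K (here the square max(|x|, |y|) = 1)
    in direction theta."""
    c, s = math.cos(theta), math.sin(theta)
    scale = max(abs(c), abs(s))
    return (c / scale, s / scale)

def k_distance(p, q):
    """Minkowski distance |p - q|_K for the sup norm."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def k_sum(points):
    """Sum of |V_i - V_j|_K over all pairs i < j."""
    return sum(k_distance(p, q) for p, q in combinations(points, 2))

def directions_around_origin(r, rng):
    """Sample r >= 3 directions with every angular gap below pi."""
    while True:
        angles = sorted(rng.uniform(0.0, 2.0 * math.pi) for _ in range(r))
        gaps = [b - a for a, b in zip(angles, angles[1:])]
        gaps.append(2.0 * math.pi - angles[-1] + angles[0])
        if max(gaps) < math.pi:
            return angles

rng = random.Random(4)
results = []
for r in range(3, 9):
    for _ in range(50):
        pts = [unit_vector_sup_norm(t) for t in directions_around_origin(r, rng)]
        results.append(k_sum(pts) >= 2 * (r - 1) - 1e-12)
print(all(results))

# Degenerate configuration: one vector is the negative of the other two.
degenerate = [(1.0, 0.0), (-1.0, 0.0), (-1.0, 0.0)]
print(k_sum(degenerate), 2 * (len(degenerate) - 1))
```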


Inequality (16) asserts that any triangle inscribed in the unit circle K in a 
Minkowski plane and having the origin in its interior has perimeter at least 4. This 
can be trivially extended as follows: If P is an interior point of K, then any triangle 
inscribed in K with P in its interior has perimeter at least twice the minimum 
chord of K through P. To prove this, observe that if the triangle contains the origin, 
then the result follows from (16). If the triangle, with vertices $V_1, V_2, V_3$, does not 
contain the origin in its interior or on its boundary, then one side, say that through 
$V_1$ and $V_2$, separates P from the origin. Then the chord of K through P parallel to 
$V_1 V_2$ has length $m \le |V_1 - V_2|_K$, so 
$|V_1 - V_2|_K + |V_2 - V_3|_K + |V_3 - V_1|_K \ge 2|V_1 - V_2|_K \ge 2m$, 
from which the result follows. 

It is natural to conjecture that the analogue of (13), for general convex curves in 
a Minkowski plane, is valid (the argument above shows it for a Minkowski circle). 
In that case, we would also have that the analogue of Theorem 3 is true in any 
Minkowski plane. 

One might also conjecture that Theorem 1 holds in any Minkowski space. We are 
unable to prove this, although it is true that inequality (4) is valid in a Minkowski 
space. 


7. Upper bounds. If $V_1, V_2, \dots, V_r$ are any unit vectors in $E^n$, then it follows 
from (3) that 


(17) $\sum |V_i - V_j|^2 \le r^2,$


with equality if and only if $\sum V_i = 0$. This, in fact, is the Putnam Examination 
problem (which was posed in case n = 3) mentioned in Section 2. We obtain from 
a straightforward application of the Cauchy inequality and (17), that 


(18) $\sum |V_i - V_j| \le r \left( \frac{r(r - 1)}{2} \right)^{1/2}.$


Equality can hold in (18) only if the distances $|V_i - V_j|$, $i < j$, are all equal and 
$\sum V_i = 0$. Since this condition cannot be satisfied if $r > n + 1$ the inequality is not 
sharp in general. However, for $r = n + 1$, we obtain 


$\sum |V_i - V_j| \le \left\{ \frac{n(n + 1)^3}{2} \right\}^{1/2},$


with equality if and only if the $V_i$ are the vertices of a regular simplex inscribed in 
$S^{n-1}$. This is equivalent to the fact that of all simplices inscribed in the unit sphere, 
the regular simplex has maximum total edge length. 
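
The case r = n + 1 can be checked on small examples. The sketch below, with hand-picked coordinates that are assumptions of this illustration, compares the total edge length of a regular triangle (n = 2) and a regular tetrahedron (n = 3) inscribed in the unit sphere with the bound $\{n(n+1)^3/2\}^{1/2}$, and also evaluates a non-regular inscribed tetrahedron.

```python
import math
from itertools import combinations

def total_edge_length(points):
    """Sum of |V_i - V_j| over all pairs i < j."""
    return sum(math.dist(p, q) for p, q in combinations(points, 2))

def simplex_bound(n):
    """Upper bound for r = n + 1 unit vectors in E^n: sqrt(n (n + 1)^3 / 2)."""
    return math.sqrt(n * (n + 1) ** 3 / 2.0)

s = 1.0 / math.sqrt(3.0)
regular_tetrahedron = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
regular_triangle = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
                    for k in range(3)]
skew_tetrahedron = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

print(total_edge_length(regular_triangle), simplex_bound(2))
print(total_edge_length(regular_tetrahedron), simplex_bound(3))
print(total_edge_length(skew_tetrahedron), simplex_bound(3))
```

The first two lines should show agreement up to rounding error, while the non-regular tetrahedron falls strictly below the bound.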

It is proved in [11, p. 155] that if $V_1, \dots, V_r$ are unit vectors in $E^2$ then 


$\sum |V_i - V_j| \le r \cot(\pi/2r),$
with equality only when the vectors are the vertices of a regular r-gon. It is also 
shown there that $\sum |V_i - V_j|^{-1}$ takes its minimum value in the case of a regular 
r-gon. What the corresponding higher dimensional results should be (for r > n + 1) 
appears to be unknown. Analogous upper bounds for such sums of distances between 
unit vectors in Minkowski spaces also appear to be unknown, even in the case of two 
dimensions. 
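
The planar bound $r \cot(\pi/2r)$ is easy to evaluate. The following sketch computes the sum of all pairwise distances for regular r-gons inscribed in the unit circle and compares it with the bound; the irregular quadrilateral at the end uses arbitrarily chosen angles and serves only as a contrast. The function names are illustrative.

```python
import math
from itertools import combinations

def sum_of_distances(points):
    """Sum of |V_i - V_j| over all pairs i < j."""
    return sum(math.dist(p, q) for p, q in combinations(points, 2))

def regular_polygon(r):
    """Vertices of a regular r-gon inscribed in the unit circle."""
    return [(math.cos(2 * math.pi * k / r), math.sin(2 * math.pi * k / r))
            for k in range(r)]

def polygon_bound(r):
    """The upper bound r * cot(pi / (2 r)) for r unit vectors in the plane."""
    return r / math.tan(math.pi / (2 * r))

gaps = [abs(sum_of_distances(regular_polygon(r)) - polygon_bound(r))
        for r in range(3, 13)]
print(max(gaps))  # zero up to rounding error

irregular = [(math.cos(t), math.sin(t)) for t in (0.0, 1.0, 2.0, 4.0)]
print(sum_of_distances(irregular), polygon_bound(4))
```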


Note: The problem of obtaining an upper bound for the sum of all distances 
determined by n points on a unit ball has been studied more than Section 7 indicates; 
besides Fejes Tóth, other workers include R. Alexander, G. Björck, E. Hille, J. B. Kelly, 
F. Nielson, G. Pólya and G. Szegő, G. Sperling, K. B. Stolarsky, and H. S. Witsenhausen. 
A systematic treatment together with generalizations and extensive bibliography 
now exists, namely Extremal Problems of Distance Geometry Related to 
Energy Integrals by R. Alexander and K. B. Stolarsky (preprints available). For the 
unit sphere in the Euclidean space $E^m$ with $m \ge 5$, the best results so far are in Sums 
of Distances Between Points on a Sphere II, K. B. Stolarsky, Proc. Amer. Math. Soc., 
to appear. For m = 5 the upper bound S(n) is shown to satisfy 


cyn? — cyn* < S(n) < cyn? — c3(e)n"~?/8 


for any $\epsilon > 0$ (the left-hand inequality is due to R. Alexander). 




References 
1. G. D. Chakerian and M. S. Klamkin, Minimum triangles inscribed in a convex curve, Math. 
Mag., 46 (1973). 
2. William Lowell Putnam Mathematical Competition, this MONTHLY, 76 (1969) 909-915. 
3. M. S. Klamkin and D. J. Newman, An inequality for the sum of unit vectors, Univ. Beo. 
Publ. Elek. Fac., Ser. Mat. i. Fiz., no. 338-352 (1971) 47-48. 
4. C. Davis, Theory of positive linear dependence, Amer. J. Math., 76 (1954) 733-746. 
5. S. S. Chern, Curves and surfaces in Euclidean space, Studies in Global Geometry and Analysis, 
Studies in Math., Vol. 4, MAA, 1967. 
6. H. T. Croft, A net to hold a sphere, J. London Math. Soc., 39 (1964) 1-4. 
7. D. S. Mitrinović, Analytic Inequalities, Springer-Verlag, Berlin, 1970. 
8. G. D. Chakerian and M. S. Klamkin, Minimal covers for closed curves, Math. Mag., 46 (1973) 
55-61. 
9. J. E. Wetzel, Covering balls for curves of constant length, L’Enseignement Math., 17 (1971) 
275-277. 
10. D. Laugwitz, Konvexe Mittelpunktsbereiche und normierte Räume, Math. Z., 61 (1954) 
235-244. 
11. L. Fejes Tóth, Regular Figures, Macmillan, New York, 1964. 


THE WILLIAM LOWELL PUTNAM MATHEMATICAL COMPETITION 


J. H. McKAY, Oakland University 


The following results of the thirty-third William Lowell Putnam Mathematical 
Competition held on December 2, 1972 have been determined in accordance with 
the regulations governing the Competition. This competition is supported by the 
William Lowell Putnam Intercollegiate Memorial Fund left by Mrs. Putnam in 
memory of her husband and is held under the auspices of the Mathematical As- 
sociation of America. 

The first prize, five hundred dollars, is awarded to the Department of Mathematics 
of the California Institute of Technology, Pasadena, California. The members of 
the team were Bruce Reznick, Arthur Rubin, and Michael Yoder; to each of these a 
prize of one hundred dollars is awarded. 

The second prize, four hundred dollars, is awarded to the Department of Mathe- 
matics of Oberlin College, Oberlin, Ohio. The members of the team were Craig 
Huneke, James Paget, and Craig Seeley; to each of these a prize of seventy-five 
dollars is awarded. 

The third prize, three hundred dollars, is awarded to the Department of Mathe- 
matics of Harvard University, Cambridge, Massachusetts. The members of the team 
were David Harbater, David Jerison, and Seth Breidbart; to each of these a prize of 
fifty dollars is awarded. 

The fourth prize, two hundred dollars, is awarded to the Department of Mathematics 
of Swarthmore College, Swarthmore, Pennsylvania. The members of the 
team were David Hough, David Shucker, and Kin On Tam; to each of these a prize 
of fifty dollars is awarded. 

The fifth prize, one hundred dollars, is awarded to the Department of Mathe- 
matics of the Massachusetts Institute of Technology, Cambridge, Massachusetts. 
The members of the team were David Christie, Joseph Mirzoeff, and Scott Brown; to 
each of these a prize of fifty dollars is awarded. 

The six persons ranking highest in the examination, named in alphabetical order, 
are Ira Gessel, Harvard University; Dean Hickerson, University of California at 
Davis; Arthur Rothstein, Reed College; Arthur Rubin, California Institute of 
Technology; David Vogan, University of Chicago; and Michael Yoder, California 
Institute of Technology. Each of these has been designated as a Putnam Fellow by 
the Mathematical Association of America and is awarded a prize of two hundred and 
fifty dollars. 


The next four highest ranking individuals, named in alphabetical order, are 
Seth Breidbart, Harvard University; Paul Lemke, Rensselaer Polytechnic Institute; 
Bruce Reznick, California Institute of Technology; and James Shearer, California 
Institute of Technology. To each of these a prize of one hundred dollars is awarded. 


The following teams, named in alphabetical order, won honorable mention: 
Princeton University, the members of the team were Angelos Tsirimokos, Ray White, 
and Loring Tu; Pomona College, the members of the team were Charles Grinstead, 
Richard Poppen, and Jerrold Griggs; Purdue University, the members of the team 
were Paul Garrett, Glenn Davis, and Paul Chew; Rice University, the members of 
the team were James Alexander, Gerald Georges, and Edwin Johnson; University 
of Toronto, the members of the team were William Franklin, Peter Debuda, and 
Robert Anderson. 


Honorable mention is given to the following thirty-one individuals, named in 
alphabetical order: Franklin Adams, University of Chicago; Kent Bailey, Oberlin 
College; Scott Brown, Massachusetts Institute of Technology; Martin Burger, 
Polytechnic Institute of Brooklyn; David Christie, Massachusetts Institute of 
Technology; Glenn Davis, Purdue University; David Dummitt, California Institute 
of Technology; Paul Farmwald, Purdue University; Robert Fisher, California 
Institute of Technology; Joseph Grcar, University of Minnesota; Alan Grenadir, 
Harvard University; Jerrold Griggs, Pomona College; David Hale, Rensselaer 
Polytechnic Institute; David Hough, Swarthmore College; Craig Huneke, Oberlin 
College; Glenn Iba, Massachusetts Institute of Technology; Paul Ilacqua, University 
of Santa Clara; Thomas Kucera, University of Manitoba; Mark Latham, University 
of British Columbia; David Levner, Cornell University; Charles Meeker, Michigan 
State University; Brian Mortimer, Carleton University; Peter Olver, Brown 
University; James Paget, Oberlin College; Robert Rumely, Grinnell College; 
Michael Somos, Case Western Reserve University; David Spear, City College of 




New York; David Ullrich, University of Wisconsin; Robert Weissler, Yale Univer- 
sity; Ray White, Princeton University; Chris Wright, Brown University. 


The other individuals who were ranked in the top one hundred, arranged by 
college, are: Maria Klawe, University of Alberta; John L. Spouge, University of 
British Columbia; Kim Schroeder, Bucknell University; Maury Bramson, Howard 
Landman, and Ross Millikan, University of California at Berkeley; Lawrence 
Gray, University of California at San Diego; Richard Niles, University of California 
at Santa Cruz; Thomas Howell, California Institute of Technology; 
Stewart Strait, California State University at San Diego; Michael Rennie, California 
State University at San Jose; Robert Bundy and Steven Kalikow, Case Western 
Reserve University; Thomas Branson and James McClure, University of Chicago; 
Charles Levermore, Clarkson College; Meir Shinnar and Jacob Sturm, Columbia 
University; Daniel Fisher and Jeffrey Hoffstein, Cornell University; David Garlock, 
David Jerison, Bruce Leverett, David Mostow, and Lyle Ramshaw, Harvard Univer- 
sity; John Boyd, Indiana University; Robert Beggs, Kent State University; John 
Bate, University of Manitoba; Dale Johannesen, University of Maryland; Leo 
Katzenstein and James Marlin, Massachusetts Institute of Technology; Dennis 
Stowe and David Catlin, University of Michigan; Steve Eicker and John Reiser, 
Michigan State University; Robert Bradley and Jean-Louis Richer, Université de 
Montréal; Albert Wigchert, University of Nevada at Reno; Eric Lofgren, New 
College; John Gilbert, University of New Mexico; Brian Hulse, New York University; 
Eli Isaacson, New York University, Washington Square; Craig Seeley, Oberlin 
College; Julius Collins, Polytechnic Institute of Brooklyn; Angelos Tsirimokos and 
Stephen Weber, Princeton University; Daniel Bump, Reed College; James 
Alexander, Edwin Johnson, and Joseph Robinett, Rice University; John Hyde and 
John Mellby, St. Olaf College; Jerome Eastham, Southwestern at Memphis; David 
Shucker, Swarthmore College; Robert Anderson and William Franklin, University 
of Toronto; Lanh Dinh Dang, University of Utah; William Parke, University of 
Washington; Jan Verster, University of Waterloo; Matthew Ginsberg, Wesleyan 
University; Robert Marheine and Robert Mortenson, University of Wisconsin at 
Madison; David Morandi, University of Wyoming. 

One thousand six hundred and eighty one students from three hundred and 
twenty two colleges and universities in the United States and Canada participated in 
the examination on December 2, 1972. 


The Questions Committee, consisting of Murray Klamkin (chairman), Nathan 
S. Mendelsohn, and Donald J. Newman, prepared the problems (listed below) for 
the competition. 


PROBLEMS. PART A 


A-1. Show that there are no four consecutive binomial coefficients (f), (11), G12), G13) 
(n, r integers > 0 and r + 3 S n) which are in arithmetic progression. 


A-2. Let S be a set and let * be a binary operation on S satisfying the laws

$x * (x * y) = y$ for all x, y in S,

$(y * x) * x = y$ for all x, y in S.

Show that * is commutative but not necessarily associative.

A-3. If for a sequence $x_1, x_2, x_3, \cdots$, $\lim_{n \to \infty} (x_1 + x_2 + \cdots + x_n)/n$ exists, call this limit the C-limit
of the sequence. A function f(x) from [0, 1] to the reals is called a supercontinuous function
on the interval [0, 1] if the C-limit exists for the sequence $f(x_1), f(x_2), f(x_3), \cdots$ whenever
the C-limit exists for the sequence $x_1, x_2, x_3, \cdots$. Find all supercontinuous functions
on [0, 1].

A-4. Of all ellipses inscribed in a square, show that the circle has the maximum perimeter.

A-5. Show that if n is an integer greater than 1, then n does not divide $2^n - 1$.

A-6. Let f(x) be an integrable function in $0 \le x \le 1$ and suppose $\int_0^1 f(x)\,dx = 0$, $\int_0^1 x f(x)\,dx = 0, \cdots,$
$\int_0^1 x^{n-1} f(x)\,dx = 0$ and $\int_0^1 x^n f(x)\,dx = 1$. Show that $|f(x)| \ge 2^n(n + 1)$ in a set of
positive measure.


PART B

B-1. Show that the power series representation for the series $\sum_{n=0}^{\infty} x^n(x - 1)^{2n}/n!$ cannot have
three consecutive zero coefficients.

B-2. A particle moving on a straight line starts from rest and attains a velocity $v_0$ after traversing
a distance $s_0$. If the motion is such that the acceleration was never increasing, find the
maximum time for the traverse.

B-3. Let A and B be two elements in a group such that $ABA = BA^2B$, $A^3 = 1$ and $B^{2n-1} = 1$
for some positive integer n. Prove B = 1.

B-4. Let n be an integer greater than 1. Show that there exists a polynomial P(x, y, z) with integral
coefficients such that $x \equiv P(x^n, x^{n+1}, x + x^{n+2})$.

B-5. If the opposite angles of a skew (non-planar) quadrilateral are equal in pairs, prove that the
opposite sides are equal in pairs.

B-6. Let $n_1 < n_2 < n_3 < \cdots < n_k$ be a set of positive integers. Prove that the polynomial
$1 + z^{n_1} + z^{n_2} + \cdots + z^{n_k}$ has no roots inside the circle $|z| < (\sqrt{5} - 1)/2$.


SOLUTIONS. PART A 


The number in parentheses, immediately following the problem number, is the number of participants
who received a score of 8, 9 or 10 (10 is maximum possible) on the problem. In the case of A-1,
A-2, B-1, and B-2, this applies to all 1681 participants. For the other problems, the count applies
only to the 957 qualifiers.




A-1 (129). For a given n and r, in order for the first three binomial coefficients 
to be in arithmetic progression, we must have 


(1) $2\binom{n}{r+1} = \binom{n}{r} + \binom{n}{r+2}$


or equivalently


(2) $2 = \dfrac{r+1}{n-r} + \dfrac{n-r-1}{r+2}.$

The condition that the last three given binomial coefficients are in arithmetic 
progression is found from (1) by replacing r by r + 1. Consequently both r and r + 1 
must satisfy equation (2) if all four terms are in arithmetic progression. 

Note that the two terms in equation (2) are interchanged if r is replaced by 
n—r-—2. Thus the quadratic equation (2) has roots 


$r,\ r+1,\ n-r-3,\ n-r-2.$


Since (2) can have only two roots, r= n — r — 3 and n = 2r + 3. The four binomial 
coefficients must be 


$\binom{2r+3}{r},\ \binom{2r+3}{r+1},\ \binom{2r+3}{r+2},\ \binom{2r+3}{r+3},$
which are the four middle terms. They cannot be in arithmetic progression since 
binomial coefficients increase to the middle term(s) and then decrease. 
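
The conclusion can be spot-checked by brute force. The following is a minimal Python sketch, not part of the published solution; the function names and the search bound are illustrative assumptions. It tests the three-term condition (1) directly and looks for four consecutive coefficients in arithmetic progression.

```python
from math import comb

def three_in_ap(n, r):
    """True if C(n, r), C(n, r+1), C(n, r+2) satisfy condition (1)."""
    a, b, c = comb(n, r), comb(n, r + 1), comb(n, r + 2)
    return 2 * b == a + c

def four_in_ap(n, r):
    """Four consecutive coefficients are in AP iff both r and r+1 satisfy (1)."""
    return three_in_ap(n, r) and three_in_ap(n, r + 1)

def counterexamples(max_n):
    """All (n, r) with r > 0 and r + 3 <= n <= max_n giving a four-term AP."""
    return [(n, r)
            for n in range(4, max_n + 1)
            for r in range(1, n - 2)
            if four_in_ap(n, r)]
```

Three-term progressions do occur (for instance 7, 21, 35 in row n = 7), so the search is not vacuous; only the four-term case is excluded.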


A-2 (167). Label the given laws (1) and (2), respectively. 
I. We first show that 


(3) $(x * y) * x = y.$


This follows from $(x * y) * x = (x * y) * [(x * y) * y] = y$. (First apply (2) with x and
y interchanged; then apply (1) with x replaced by $x * y$.)
We now obtain


(4) $y * x = [(x * y) * x] * x = x * y.$


(First apply (3); then apply (2) with y replaced by $x * y$.) This proves that * is
commutative.
II. Let S be the set of all integers. Define $x * y = -x - y$. Then


(5) $x * (y * z) = -x + y + z; \qquad (x * y) * z = x + y - z.$


It follows from (5) that, in the first place, (1) and (2) hold and, secondly, * fails to be
associative: simply choose $x \ne z$ in (5).
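
As an illustration only, the example in Part II can be checked mechanically on a finite range of integers. This Python sketch assumes the operation x * y = -x - y from the text; the helper names are hypothetical.

```python
from itertools import product

def star(x, y):
    """The operation x * y = -x - y from Part II."""
    return -x - y

def check_laws(values):
    """Return (law 1, law 2, commutative, associative) over the given values."""
    pairs = list(product(values, repeat=2))
    law1 = all(star(x, star(x, y)) == y for x, y in pairs)
    law2 = all(star(star(y, x), x) == y for x, y in pairs)
    commutative = all(star(x, y) == star(y, x) for x, y in pairs)
    associative = all(star(x, star(y, z)) == star(star(x, y), z)
                      for x, y, z in product(values, repeat=3))
    return law1, law2, commutative, associative
```

On the integers from -5 to 5 both laws and commutativity hold while associativity fails, in agreement with (5).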

Alternate Solution, Part I (suggested by Martin Davis): 

Write the equation $x * y = z$ as P(x, y, z). Then law (1) may be written "If $x * y = z$
then $x * z = y$" or

(6) P(x, y, z) implies P(x, z, y).

Similarly, the law (2) may be written

(7) P(y, x, z) implies P(z, x, y).


These two implications, (6) and (7), show that the permutations (23) and (13) on
the location of the variables in P(x, y, z) are permitted. Since (13), (23) generate the
symmetric group $S_3$, we find (12) is also permitted.

Thus, P(x, y, z) implies P(y, x, z) or $x * y = z$ implies $y * x = z$, which means
$x * y = y * x$.


A-3 (6). A function is "supercontinuous" if and only if it is affine, f(x) = Ax + B.
The sufficiency is trivial (and was worth 1 point in the grading). For the necessity:
First we note that it is not assumed that f(C-limit) = C-limit (f) (otherwise the
solution could be materially simplified). The essential steps are to show that if f is
supercontinuous, then (1) f is continuous, and (2) $f((a + b)/2) = (f(a) + f(b))/2$ for
all a, b. These two statements imply that f is affine. The proofs of (1) and (2) are
similar; we give (2) (which is the harder). Set $c = (a + b)/2$, and suppose
$f(c) \ne (f(a) + f(b))/2$. Imagine any sequence of integers $N_i$ which "grows very rapidly"; say let
$N_{i+1}$ exceed $2^i N_i$. Then construct a sequence of points $\{x_n\}$ as follows: Break the
sequence into blocks, alternating between

$\{x_n\} = a, b, a, b, a, b, \cdots$

and

$\{x_n\} = c, c, c, c, c, c, \cdots,$


the ab pattern holding for $N_{2i-1} \le n < N_{2i}$, and the c pattern holding for
$N_{2i} \le n < N_{2i+1}$. Then $\{x_n\}$ has the C-limit c, but the averages of $\{f(x_n)\}$ oscillate
(because the lengths of the blocks $N_i \le n < N_{i+1}$ increase very fast, and $f(c) \ne$ the
average of f(a) and f(b)). Thus the C-limit of $\{f(x_n)\}$ does not exist, a contradiction.

Comments: Many interesting classes of functions were proposed as the "answers"
to this question. The most common choices were the class of all bounded functions
and the continuous functions. (The correct choice ranked third in frequency.)
Riemann and Lebesgue integrable functions were also mentioned.

Professor David Cohoon has suggested the following problem: Is there any
topology on the real line in terms of which the class of continuous functions coincides
with the "supercontinuous" functions? (Here the functions are from R to R, and
the same topology is to be put on R as image space and domain.)


A-4 (1). Let the square of sidelength 2R have the vertices $(\pm R\sqrt{2}, 0)$ and
$(0, \pm R\sqrt{2})$. The ellipse

(1) $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$

with $0 \le b \le a \le R\sqrt{2}$ has the line $x + y = R\sqrt{2}$ as a tangent if and only if the
quadratic equation $x^2/a^2 + (R\sqrt{2} - x)^2/b^2 = 1$ has a double root. It can be verified
that its discriminant vanishes if and only if $a^2 + b^2 = 2R^2$. As a varies from R to
$R\sqrt{2}$ and b varies from R to 0, the curve (1) varies from the circle of radius R through
all the non-circular ellipses inscribed in the square to the degenerate "flat" ellipse
lying on the x-axis.

Let 4L denote the length of the ellipse $x = a\cos t$, $y = b\sin t$, $0 \le t \le 2\pi$. Then


$L = \int_0^{\pi/2} [a^2\sin^2 t + b^2\cos^2 t]^{1/2}\,dt = \int_0^{\pi/2} [\tfrac{1}{2}a^2(1 - \cos 2t) + \tfrac{1}{2}b^2(1 + \cos 2t)]^{1/2}\,dt$

$\quad = \int_0^{\pi/2} [R^2 - \tfrac{1}{2}c^2\cos 2t]^{1/2}\,dt,$


where $c^2 = a^2 - b^2$. The last integral we split into one from 0 to $\pi/4$ and one from
$\pi/4$ to $\pi/2$, and in the latter we substitute $t = \pi/2 - t'$, obtaining


(2) $L = \int_0^{\pi/4} \{[R^2 - \tfrac{1}{2}c^2\cos 2t]^{1/2} + [R^2 + \tfrac{1}{2}c^2\cos 2t]^{1/2}\}\,dt.$


Note that $\cos 2t > 0$ for $0 < t < \pi/4$.

Now the function $f(u) = (p - u)^{1/2} + (p + u)^{1/2}$ decreases in the interval $0 \le u \le p$,
because $2f'(u) = -(p - u)^{-1/2} + (p + u)^{-1/2} < 0$ for $0 < u < p$. Thus the integral in
(2) as a function of c has its largest value when c = 0, that is, for the inscribed circle.
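
The monotonicity claim can be illustrated numerically. The sketch below is not from the original solution: it approximates the arc-length integral by a midpoint rule (an arbitrary choice of quadrature) and walks through the family $a^2 + b^2 = 2R^2$ from the circle to the flat ellipse. The function names, step count and sample count are assumptions made for the illustration.

```python
import math

def ellipse_perimeter(a, b, steps=2000):
    """Midpoint-rule approximation of the length of x = a cos t, y = b sin t."""
    h = (math.pi / 2) / steps
    quarter = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        quarter += math.sqrt((a * math.sin(t)) ** 2 + (b * math.cos(t)) ** 2)
    return 4 * quarter * h

def inscribed_perimeters(R=1.0, samples=50):
    """Perimeters of ellipses with a^2 + b^2 = 2R^2, for a from R up to R*sqrt(2)."""
    result = []
    for k in range(samples + 1):
        a = R + (math.sqrt(2) * R - R) * k / samples
        b = math.sqrt(max(0.0, 2 * R * R - a * a))  # guard against rounding below 0
        result.append(ellipse_perimeter(a, b))
    return result
```

With R = 1 the first entry is the circle's circumference and the values decrease steadily toward the flat ellipse, whose doubled length is $4\sqrt{2}$.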

To show that an ellipse inscribed in the square must have its axes along the
diagonals of the square, we choose the square as having sides $u = \pm R$ and $v = \pm R$
and the ellipse as having the equation


$Au^2 + Buv + Cv^2 + Du + Ev + F = 0,$

where

(1) $4AC - B^2 > 0.$

Taking the "highest," "lowest," "rightest," and "leftest" points on the ellipse, we
see that all four sides of the square must be tangents to the ellipse.


The line u = R is a tangent if and only if the equation $Cv^2 + (BR + E)v + (AR^2 + DR + F) = 0$
has a double root or


(2) $(BR + E)^2 - 4C(AR^2 + DR + F) = 0.$

The corresponding conditions for u = -R, v = R and v = -R are

(3) $(-BR + E)^2 - 4C(AR^2 - DR + F) = 0,$

(4) $(BR + D)^2 - 4A(CR^2 + ER + F) = 0,$

(5) $(-BR + D)^2 - 4A(CR^2 - ER + F) = 0,$

respectively. Subtract (2) from (3) and divide by 4R; this gives

(6) $2CD - BE = 0.$

Similarly, from (4) and (5),

(7) $-BD + 2AE = 0.$

By (6), (7) and (1), D = E = 0. Therefore (2) and (4) become

$B^2R^2 - 4ACR^2 - 4CF = 0, \qquad B^2R^2 - 4ACR^2 - 4AF = 0,$

respectively. Since $F \ne 0$, we have A = C; this means that the ellipse has its axes
along the lines $u \pm v = 0$.


A-5 (22). Assume that n divides $2^n - 1$ for some n > 1. Since $2^n - 1$ is odd, n is
odd. Let p be the smallest prime factor of n. By Euler's Theorem, $2^{\phi(p)} \equiv 1 \pmod{p}$,
because p is odd. If $\lambda$ is the smallest positive integer such that $2^{\lambda} \equiv 1 \pmod{p}$ then $\lambda$
divides $\phi(p) = p - 1$. Since $\lambda > 1$, it follows that $\lambda$ has a smaller prime divisor than p. But $2^n \equiv 1 \pmod{p}$
and so $\lambda$ also divides n. This means that n has a smaller prime divisor than p.
Contradiction.
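
For readers who want to experiment, here is a short Python sketch (an illustration, not part of the published solution; the names and the search limit are assumptions). It searches for any n > 1 dividing $2^n - 1$ using modular exponentiation, and computes the order $\lambda$ of 2 modulo an odd prime so that the divisibility $\lambda \mid p - 1$ used in the proof can be observed.

```python
def exceptions(limit):
    """All n with 1 < n <= limit such that n divides 2**n - 1."""
    return [n for n in range(2, limit + 1) if pow(2, n, n) == 1]

def order_of_two(p):
    """Smallest positive integer lam with 2**lam congruent to 1 mod p (p an odd prime)."""
    lam, power = 1, 2 % p
    while power != 1:
        power = (power * 2) % p
        lam += 1
    return lam
```

For example, the order of 2 modulo 7 is 3, which divides 6, and no exception appears in any range searched.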


A-6 (9). The conditions imply $\int_0^1 (x - \tfrac{1}{2})^n f(x)\,dx = 1$. Suppose $|f(x)| < 2^n(n + 1)$
except for a set of measure 0.
Then $1 = \int_0^1 (x - \tfrac{1}{2})^n f(x)\,dx < 2^n(n + 1)\int_0^1 |x - \tfrac{1}{2}|^n\,dx = 1$, a contradiction.


SOLUTIONS. PART B 


B-1 (45). For the proposed solution the problem could have been stated in the 
more general form: The series expansion about any point for exp(P(x)), if P(x) is a 
cubic polynomial, will not have three consecutive zero coefficients. 


If f(x) = exp(P(x)), where P(x) is a cubic polynomial, then $f' = f \cdot P'$ and
$f'' = f' \cdot P' + f \cdot P''$. In general for $k \ge 2$,


(1) $f^{(k+1)} = f^{(k)} \cdot P' + \binom{k}{1} f^{(k-1)} \cdot P'' + \binom{k}{2} f^{(k-2)} \cdot P'''.$


It follows from (1): if, at some (real or complex) point $x_0$, $f^{(k-2)}(x_0) = f^{(k-1)}(x_0) = f^{(k)}(x_0) = 0$,
then also $f^{(k+1)}(x_0) = 0$. By the same argument, $f^{(u)}(x_0) = 0$ for
$u = k+2, k+3, \cdots$; so that f(x) would reduce to a polynomial. This is evidently
impossible.

Alternate Solution: In the given form of the problem it can be shown that no
coefficient of $x^k$ is zero. The product $x^n(1 - x)^{2n}$ has a non-zero coefficient for $x^k$ if
$0 \le k - n \le 2n$ or, equivalently, $k/3 \le n \le k$. This coefficient is the integer


$(-1)^{k-n}\binom{2n}{k-n},$


which we denote by a(n, k). The coefficient of $x^k$ in the given series is


$C_k = \sum_{k/3 \le n \le k} \dfrac{a(n, k)}{n!}.$

Multiplying through this summation by (k - 1)! will convert each term, except the
last term, to an integer. The last term becomes 1/k. Since (k - 1)! times $C_k$ is not an
integer for k > 1 and $C_1 = C_0 = 1$, there are no zero coefficients in the expansion of
the given series in powers of x.
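
The alternate solution lends itself to an exact computation. The following Python sketch (illustrative only; the helper names are assumptions) evaluates $C_k$ with rational arithmetic from the formula for a(n, k) and confirms that $(k-1)!\,C_k$ differs from an integer by exactly 1/k.

```python
from fractions import Fraction
from math import comb, factorial

def a(n, k):
    """Coefficient of x**k in x**n * (1 - x)**(2n); zero when k - n > 2n."""
    return (-1) ** (k - n) * comb(2 * n, k - n)

def coefficient(k):
    """C_k, the coefficient of x**k in sum over n of x**n (x - 1)**(2n) / n!."""
    return sum(Fraction(a(n, k), factorial(n)) for n in range(k + 1))

def fractional_part_is_one_over_k(k):
    """Check that (k-1)! * C_k minus 1/k is an integer, as the argument asserts."""
    scaled = factorial(k - 1) * coefficient(k)
    return (scaled - Fraction(1, k)).denominator == 1
```

For instance, the coefficient of $x^2$ comes out as -3/2, and no coefficient vanishes in the range checked.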


B-2 (152). We take $v_0$ as positive (see Comment) and consider the graph of v as a
function of t (see Figure 1). From the given data we know that the curve starts at the
origin and is concave downward since the acceleration $a = dv/dt$ does not increase.


[FIG. 1. Graph of v as a function of t.]


Let $t_0$ be the time of the traverse. Then $v(t_0) = v_0$. The distance $s_0$ is represented by the
area bounded by the curve v = v(t), the t-axis, and the line $t = t_0$. The
right triangle with vertices at (0, 0), $(t_0, 0)$ and $(t_0, v_0)$ has area less than or equal to $s_0$.
Thus $\tfrac{1}{2} v_0 t_0 \le s_0$ or


$t_0 \le \dfrac{2s_0}{v_0}.$


Equality is possible and gives the maximum value of $t_0$ (for given $s_0$ and $v_0$) when the
graph of v(t) is the straight line $v(t) = (v_0/t_0)t = (v_0^2/2s_0)t$.

Comment: If $v_0$ is zero or negative, there is no maximum time $t_0$ for the traverse.
In the case $v_0 = 0$ the equation of motion


$s = s_0[3(t/t_0)^2 - 2(t/t_0)^3], \qquad 0 \le t \le t_0,$

satisfies the conditions of the problem for any $t_0 > 0$.

B-3 (123). From $ABA = BA^2B = BA^{-1}B$, we have

$AB^2 = ABA \cdot A^{-1}B = BA^{-1}BA^{-1}B = BA^{-1} \cdot ABA = B^2A.$

By induction, $AB^{2n} = B^{2n}A$ so that $AB = AB^{2n} = B^{2n}A = BA$. Since A and B
commute, $ABA = BA^2B$ implies $A^2B = A^2B^2$, or $B = B^2$, or B = 1.


Alternate Solution: It can be shown that A and B commute by expressing each as
powers of the same group element. Because $A^3 = 1$ it is tempting to multiply
$ABA = BA^2B$ on the right by $A^2$ and then on the left by $BA^2$ to get $B^2 = (BA^2)^3$. Set
$X = BA^2$ and use $B^{2n} = B$ to obtain


(1) $B = X^{3n}.$

From $X = BA^2$, we get XA = B, $A = X^{-1}B$, or

(2) $A = X^{3n-1}.$

The conclusion that B = 1 is as before.


B-4 (99). Let $x = t^n$, $y = t^{n+1}$, $z = t + t^{n+2}$. We construct a polynomial P(x, y, z)
with integral coefficients such that P(x, y, z) = t. We have


$z = t + t^{n+2},$
$zy = t^{n+2} + t^{2n+3},$
$zy^2 = t^{2n+3} + t^{3n+4},$
$\cdots$
$zy^{n-2} = t^{n^2-n-1} + t^{n^2}.$


Multiply the above equations alternately by +1 and -1 and add:

$z[1 - y + y^2 - \cdots + (-1)^{n-2}y^{n-2}] = t + (-1)^{n-2}t^{n^2}.$

Hence, if we define


$P(x, y, z) = z\left[\sum_{i=0}^{n-2} (-1)^i y^i\right] + (-1)^{n-1}x^n,$


then $P(t^n, t^{n+1}, t + t^{n+2}) = t$.
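
The construction can be tested with exact integer arithmetic. The Python sketch below is an illustration under the assumption that P is the polynomial defined above, namely $z\sum_{i=0}^{n-2}(-1)^i y^i + (-1)^{n-1}x^n$; the function names are hypothetical.

```python
def P(x, y, z, n):
    """The polynomial z * sum_{i=0}^{n-2} (-1)^i y^i + (-1)^(n-1) x^n."""
    alternating = sum((-1) ** i * y ** i for i in range(n - 1))
    return z * alternating + (-1) ** (n - 1) * x ** n

def identity_holds(n, t):
    """Check P(t^n, t^(n+1), t + t^(n+2)) == t for one integer t."""
    return P(t ** n, t ** (n + 1), t + t ** (n + 2), n) == t
```

Since both sides are polynomials in t, agreement at more integer points than the degree $n^2$ amounts to a full check of the identity for that n; for n = 2, say, 13 points already suffice.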


B-5 (0). For the skew quadrilateral ABCD, let AB = a, BC = b, CD = c,
DA = d, AC = x, BD = y. None of these lengths can be zero. By the law of cosines:


$\dfrac{a^2 + b^2 - x^2}{ab} = \dfrac{c^2 + d^2 - x^2}{cd}$


or $(ab - cd)x^2 = (bc - ad)(ac - bd)$. Similarly, $(ad - bc)y^2 = (cd - ab)(ac - bd)$.
CASE 1: ab - cd = 0.
Then, ad - bc = 0 and a = c, b = d.
CASE 2: $ab - cd \ne 0$.
Then, $bc - ad \ne 0$, $ac - bd \ne 0$ and $x^2y^2 = (ac - bd)^2$. Consequently,
ac = xy + bd or bd = ac + xy.


By Ptolemy's Theorem (in space), ABCD must be concyclic which violates the skew
condition.

Alternate Solution: If AC = BC and AD = BD, the conclusion that AC = BD
and BC = AD is obvious (see Figure 2) so assume $AC \ne BC$. With this assumption
we first show BD = AC. If $BD \ne AC$ there exists a unique point D* in the plane of
triangle ADB with BD* = AC, AD* = CB, $\angle AD^*B = \angle ACB = \angle ADB$. From triangles ADE and
BD*E it follows that $\angle DAE = \angle D^*BE$.


[FIG. 2]


From the congruent triangles CD*A and CD*B it follows that $\angle CAD^* = \angle CBD^*$.
These angle equalities prove that the trihedral angles A-CDD* and B-CDD* are
congruent. Hence the angle which CA makes with the plane ADD* is equal to the angle
CB makes with the plane BDD* (which is the plane ADD*). If H is the foot of the
altitude from C to this plane the triangles CHA and CHB are congruent right triangles.
That is AC = CB. This is a contradiction and BD = AC.

Interchanging the roles of B and A in the above shows that AD = BC. 


B-6 (1). Let P(z) denote the given polynomial. The power series expansion of
$1/(1 - z) - 2P(z)$ has coefficients $\pm 1$ with leading coefficient -1. Hence,


(1) $\left|1 + \dfrac{1}{1-z} - 2P(z)\right| \le |z| + |z|^2 + \cdots = \dfrac{|z|}{1 - |z|}.$

Also,

$|2P(z)| = \left|1 + \dfrac{1}{1-z} - \left(1 + \dfrac{1}{1-z} - 2P(z)\right)\right|$

$\quad \ge 1 + \dfrac{1}{1 + |z|} - \dfrac{|z|}{1 - |z|} = \dfrac{2(1 - |z| - |z|^2)}{1 - |z|^2}.$


The latter term is positive for $|z| < (\sqrt{5} - 1)/2$.


Acknowledgments 


The Director acknowledges, with appreciation, the assistance of Fritz Herzog, the Questions 
Committee, and the graders, especially L. M. Kelly, M. Hausner and J. I. Richards, in preparing the 
above solutions. The graders for the competition were: J. C. Chipman, C. V. Coffman, J. W. Dettman, 
R. A. DeVore, R. M. Dudley, W. R. Emerson, D. J. Eustice, R. A. Fontenot, J. Froemke, R. A.
Gambill, M. Hausner, A. P. Hillman, L. M. Kelly, B. B. Lieberman, D. G. Malm, E. A. Nordhaus,
R. Pollack, J. I. Richards, D. Rosen, P. J. Sally, M. E. Shanks, H. A. Smith, K. E. Westerbeck,
E. T. Wong.


SIMPLE GROUPS* 


(Sung to the tune of “Sweet Betsy from Pike’’) 


What are the orders of all simple groups? 

I speak of the honest ones, not of the loops. 

It seems that old Burnside their orders has 
guessed 

Except for the cyclic ones, even the rest. 


Groups made up with permutes will produce 
some more: 

For $A_n$ is simple, if n exceeds 4.

Then, there was Sir Matthew who came into 
view 

Exhibiting groups of an order quite new. 


Still others have come on to study this thing. 

Of Artin and Chevalley now we shall sing. 

With matrices finite they made quite a list 

The question is: Could there be others they’ve 
missed ? 


Suzuki and Ree then maintained it’s the case 

That these methods had not reached the end of 
the chase. 

They wrote down some matrices, just four by 
four, 

That made up a simple group. Why not make 
more? 


And then came the opus of Thompson and Feit 
Which shed on the problem remarkable light. 


A group, when the order won’t factor by two 
Is cyclic or solvable. That’s what is true. 


Suzuki and Ree had caused eyebrows to raise, 

But the theoreticians they just couldn’t faze. 

Their groups were not new: if you added a twist, 

You could get them from old ones with a flick 
of the wrist. 


Still, some hardy souls felt a thorn in their side. 
For the five groups of Mathieu all reason defied; 
Not $A_n$, not twisted, and not Chevalley,

They called them sporadic and filed them away. 


Are Mathieu groups creatures of heaven or hell? 
Zvonimir Janko determined to tell. 

He found out that nobody wanted to know: 
The masters had missed 1 7 5 5 6 0.


The floodgates were opened! New groups were 
the rage! 

(And twelve or more sprouted, to greet the new 
age.) 

By Janko and Conway and Fischer and Held 

McLaughlin, Suzuki, and Higman, and Sims. 


No doubt you noted the last lines don’t rhyme. 
Well, that is, quite simply, a sign of the time. 
There’s chaos, not order, among simple groups; 
And maybe we’d better go back to the loops. 


* Found scrawled on a library table in Eckhart Library at the U. of Chicago; author unknown,
or in hiding. (See W. E. Mientka, Professor Leo Moser - Reflections of a Visit, American
Mathematical Monthly 79 (1972), 609-614.)


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


ON A PROBLEM CONCERNING EULER’S PHI-FUNCTION 
HAROLD DONNELLY, Berkeley, California 


Let $\phi(N)$ denote Euler's phi-function; thus $\phi(N)$ is the number of positive
integers less than N which are relatively prime to N. Our attention is centered on
whether there exists a number M such that $\phi(N) = M$ has exactly one solution
[cf. 2, p. 37, prob. 15].

While this problem is unsolved at present, we obtain several results, including
a lower bound on the size the integer N must have.

The well-known formula $\phi(\prod p_i^{a_i}) = \prod p_i^{a_i - 1}(p_i - 1)$ will be applied throughout.
The table of prime numbers given by Baker and Gruenberger [1] was used to verify
that certain numbers appearing in the proofs of Theorems 3 and 4 are primes as
asserted.


THEOREM 1. If N is the unique solution to $\phi(N) = M$ for some M, then
$2^2 3^2 7^2 43^2$ divides N.

Proof. If $2 \nmid N$, then $\phi(2N) = \phi(N)$ and if $2 \mid N$ but $2^2 \nmid N$, then $\phi(2^{-1}N) = \phi(N)$.
Thus $2^2 \mid N$. If $3 \nmid N$, then $\phi(2^{-1}3N) = \phi(N)$; if $3 \mid N$ and $3^2 \nmid N$, then
$\phi(2 \cdot 3^{-1}N) = \phi(N)$. Thus we have so far that $2^2 3^2 \mid N$. Next, if $7 \nmid N$, then
$\phi(2^{-1}3^{-1}7N) = \phi(N)$; if $7 \mid N$ and $7^2 \nmid N$, then $\phi(2 \cdot 3 \cdot 7^{-1}N) = \phi(N)$, and therefore
$2^2 3^2 7^2 \mid N$. Finally if $43 \nmid N$, then $\phi(2^{-1}3^{-1}7^{-1}43N) = \phi(N)$; if $43 \mid N$ and $43^2 \nmid N$,
then $\phi(2 \cdot 3 \cdot 7 \cdot 43^{-1}N) = \phi(N)$, and so $2^2 3^2 7^2 43^2 \mid N$.


Note. The proof clearly rests on the fact that each prime divisor is the product
of previous prime divisors plus one, i.e., $3 = 2 + 1$, $7 = 2 \times 3 + 1$, $43 = 2 \times 3 \times 7 + 1$.
This cannot be continued since the remaining products of known prime
divisors plus one are $15 = 2 \times 7 + 1$, $87 = 2 \times 43 + 1$, $259 = 2 \times 3 \times 43 + 1$,
$603 = 2 \times 7 \times 43 + 1$, $1807 = 2 \times 3 \times 7 \times 43 + 1$ and these are all composite.


THEOREM 2. If K is the smallest N such that for some M, $\phi(N) = M$ has exactly
one solution, then $2^3$ does not divide K.


Proof. By Theorem 1, $2^2 \mid K$. Since K is the smallest number giving exactly
one solution we know there is a $J \ne 2^{-1}K$ such that $\phi(J) = \phi(2^{-1}K)$. If $2 \mid J$, then
$\phi(2J) = \phi(K)$ implying $J = 2^{-1}K$ since K is a unique solution; therefore $2 \nmid J$.
Thus $\phi(4J) = 2\phi(J) = 2\phi(2^{-1}K) = \phi(K)$ and we see that $J = 4^{-1}K$. But if $2^3 \mid K$,
then $\phi(4^{-1}K) = 4^{-1}\phi(K)$; but $\phi(2^{-1}K) = 2^{-1}\phi(K)$ and these are not equal,
therefore $2^3 \nmid K$.


REMARK. By Theorem 1, it follows that the second power is the exact power of
2 dividing K.
We next consider the two possibilities for 3; i.e., $3^3 \mid K$ and $3^3 \nmid K$.


THEOREM 3. If K is the smallest number N such that for some M, $\phi(N) = M$
has exactly one solution and it should occur that $3^3$ divides K, then
(3)[(2)(3)(7)(19)(43)(127)(2287)(4903)(5419)(14,479)(98,299)(101,347)(304,039)(617,767)(688,087)
(4,324,363)(26,563,939)(78,456,283)(86,714,839)]$^2$ divides K.


Proof. From Theorem 1 and our hypotheses, $2^2 3^3 7^2 43^2 \mid K$.

If $19 \nmid K$, then $\phi(18^{-1}19K) = \phi(K)$. If $19 \mid K$ and $19^2 \nmid K$, then $\phi(18 \cdot 19^{-1}K) = \phi(K)$,
and so $19^2 \mid K$. Similarly if $127 \nmid K$, then $\phi(126^{-1}127K) = \phi(K)$. If
$127 \mid K$ and $127^2 \nmid K$, then $\phi(126 \cdot 127^{-1}K) = \phi(K)$; if $5419 \nmid K$, then
$\phi(5418^{-1}5419K) = \phi(K)$; if $5419 \mid K$ but $5419^2 \nmid K$, then $\phi(5418 \cdot 5419^{-1}K) = \phi(K)$.
Note that this process works because $19 = 2 \times 3^2 + 1$, $127 = 2 \times 3^2 \times 7 + 1$,
$5419 = 2 \times 3^2 \times 7 \times 43 + 1$ and because 19, 127, 5419 are prime (cf. [1]).

We thus have so far that $2^2 3^3 7^2 19^2 43^2 127^2 5419^2 \mid K$.

We next note that


$2287 = 2 \times 3^2 \times 127 + 1$; $4903 = 2 \times 3 \times 19 \times 43 + 1$;
$14{,}479 = 2 \times 3 \times 19 \times 127 + 1$; $98{,}299 = 2 \times 3^2 \times 43 \times 127 + 1$;
$101{,}347 = 2 \times 3 \times 7 \times 19 \times 127 + 1$; $304{,}039 = 2 \times 3^2 \times 7 \times 19 \times 127 + 1$;
$617{,}767 = 2 \times 3 \times 19 \times 5419 + 1$; $688{,}087 = 2 \times 3^2 \times 7 \times 43 \times 127 + 1$;
$4{,}324{,}363 = 2 \times 3 \times 7 \times 19 \times 5419 + 1$; $26{,}563{,}939 = 2 \times 3 \times 19 \times 43 \times 5419 + 1$;
$78{,}456{,}283 = 2 \times 3 \times 19 \times 127 \times 5419 + 1$; $86{,}714{,}839 = 2 \times 3^2 \times 7 \times 127 \times 5419 + 1$;


and that the numbers appearing on the left hand side of the equal signs are prime
(cf. [1]). The theorem results by arguments entirely similar to those above.
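
The arithmetic in this proof is easy to re-check by machine. The Python sketch below is an illustration, not the author's method (the paper relies on the printed table [1]). It assumes the factorizations read as $p = (\text{product}) + 1$ with the exponents of 3 chosen so that each product equals p - 1; the table name and helper functions are hypothetical.

```python
def is_prime(m):
    """Trial division; adequate for numbers below 10**8."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

# Each claimed prime p mapped to the factors whose product should equal p - 1.
THEOREM_3_TABLE = {
    19: (2, 3, 3), 127: (2, 3, 3, 7), 5419: (2, 3, 3, 7, 43),
    2287: (2, 3, 3, 127), 4903: (2, 3, 19, 43), 14479: (2, 3, 19, 127),
    98299: (2, 3, 3, 43, 127), 101347: (2, 3, 7, 19, 127),
    304039: (2, 3, 3, 7, 19, 127), 617767: (2, 3, 19, 5419),
    688087: (2, 3, 3, 7, 43, 127), 4324363: (2, 3, 7, 19, 5419),
    26563939: (2, 3, 19, 43, 5419), 78456283: (2, 3, 19, 127, 5419),
    86714839: (2, 3, 3, 7, 127, 5419),
}

def products_match(table):
    """True if every listed product plus one equals the claimed prime."""
    for p, factors in table.items():
        value = 1
        for q in factors:
            value *= q
        if value + 1 != p:
            return False
    return True

def non_primes(table):
    """Entries of the table that fail the primality test (expected: none)."""
    return [p for p in table if not is_prime(p)]
```

Calling `products_match(THEOREM_3_TABLE)` checks the identities, and `non_primes(THEOREM_3_TABLE)` lets a reader confirm the primality claims taken from [1] without the printed table.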


THEOREM 4. If K is the smallest number N such that for some M, $\phi(N) = M$
has exactly one solution and it should occur that $3^3 \nmid K$, then [(2)(3)(7)(13)(43)
(79)(157)(547)(1093)(3613)(6709)(46,957)(303,493)(12,118,003)]$^2$ divides K.


Proof. $2^2 3^2 7^2 43^2$ divides K by Theorem 1, $2^3 \nmid K$ by Theorem 2, and $3^3 \nmid K$
by assumption.

If $13 \nmid K$, then $\phi(13 \cdot 36^{-1}K) = \phi(K)$ so $13 \mid K$. If $13 \mid K$, $13^2 \nmid K$, then
$\phi(12 \cdot 13^{-1}K) = \phi(K)$. So $13^2 \mid K$. If $3613 \nmid K$, then $\phi(7^{-1}36^{-1}43^{-1}3613K) = \phi(K)$.
If $3613 \mid K$, $3613^2 \nmid K$, then $\phi(3612 \cdot 3613^{-1}K) = \phi(K)$. So $2^2 3^2 7^2 13^2 43^2 3613^2 \mid K$.

This process works because $13 = 2^2 \times 3 + 1$; $3613 = 2^2 \times 3 \times 7 \times 43 + 1$ and
because 13, 3613 are prime (cf. [1]).
Noting that


$79 = 2 \times 3 \times 13 + 1$; $157 = 2^2 \times 3 \times 13 + 1$; $547 = 2 \times 3 \times 7 \times 13 + 1$;
$1093 = 2^2 \times 3 \times 7 \times 13 + 1$; $6709 = 2^2 \times 3 \times 13 \times 43 + 1$;
$46{,}957 = 2^2 \times 3 \times 7 \times 13 \times 43 + 1$; $303{,}493 = 2^2 \times 3 \times 7 \times 3613 + 1$;
$12{,}118{,}003 = 2 \times 3 \times 13 \times 43 \times 3613 + 1$;


and that the numbers appearing on the left hand side of the equations are prime
(cf. [1]) we obtain the required result.


THEOREM 5. A lower bound on a number N such that $\phi(N) = M$ has exactly
one solution is $10^{77}$.


Proof. Follows from Theorems 3 and 4. 
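
To make the size of the bound concrete, the following Python sketch multiplies out the divisors named in Theorems 3 and 4. It is an illustration only; the prime lists are copied from the theorem statements and the variable and function names are assumptions.

```python
from math import prod

# Primes appearing in the statements of Theorems 3 and 4.
THEOREM_3_PRIMES = [2, 3, 7, 19, 43, 127, 2287, 4903, 5419, 14479, 98299,
                    101347, 304039, 617767, 688087, 4324363, 26563939,
                    78456283, 86714839]
THEOREM_4_PRIMES = [2, 3, 7, 13, 43, 79, 157, 547, 1093, 3613, 6709,
                    46957, 303493, 12118003]

def theorem_3_divisor():
    """3 times the square of the product of the Theorem 3 primes."""
    return 3 * prod(THEOREM_3_PRIMES) ** 2

def theorem_4_divisor():
    """The square of the product of the Theorem 4 primes."""
    return prod(THEOREM_4_PRIMES) ** 2

def digits(m):
    """Number of decimal digits of a positive integer."""
    return len(str(m))
```

Since K is divisible by one divisor or the other, the smaller of the two is a lower bound for K; it is the Theorem 4 divisor, which has 78 decimal digits.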


References 


1. C. L. Baker and F. J. Gruenberger, The First Six Million Prime Numbers, Santa Monica,
The Rand Corporation, 1957.

2. Ivan Niven and H. S. Zuckerman, An Introduction to the Theory of Numbers, Wiley, New
York, 1964.


WHAT IS THE PROBABILITY THAT TWO GROUP ELEMENTS COMMUTE? 
W. H. GusTAFSON, Indiana University 


1. Introduction. A student studying both probability and algebra might well ask 
the question posed in the title. One can solve the problem for finite groups by a 
straightforward attack as follows: 

Let G be a group of finite order n. The probability Pr(G) that two elements
selected at random (with replacement) from G are commutative is $|C|/n^2$, where
$C = \{(x, y) \in G \times G \mid xy = yx\}$. In order to count the elements of C, we observe
that for each $x \in G$, the number of elements of C of the form (x, y) is $|C_x|$, where $C_x$
is the centralizer of x in G. Hence we have


$|C| = \sum_{x \in G} |C_x|,$

where the sum extends over all $x \in G$. Now we recall that if x and y are conjugate
elements of G, then $C_x$ and $C_y$ are conjugate subgroups. Further, the number of
elements in the conjugacy class of x is $[G : C_x]$. Hence, if $x_1, \ldots, x_k$ are representatives
of the conjugacy classes in G, we have


$|C| = \sum_{i=1}^{k} [G : C_{x_i}] \cdot |C_{x_i}| = k \cdot n.$


Thus Pr(G) = k/n, the number of classes in G divided by the order of G. This
technique was used by Erdős and Turán [4].

Now let us observe that 5/8 is an upper bound for Pr(G) when G is nonabelian.
For the "class equation" tells us that


$|G| = |Z| + |K_1| + \cdots + |K_t|,$


where Z is the center of G, and $K_1, \ldots, K_t$ are the nontrivial conjugacy classes. We
have $|K_i| \ge 2$ for $i = 1, \ldots, t$, whence $(|G| - |Z|)/2 \ge t$. Thus $k = t + |Z| \le (|G| + |Z|)/2$.
As G is nonabelian, G/Z is not cyclic (see Scott [7, p. 50]) and hence
$|Z| \le |G|/4$. Thus $k \le \tfrac{5}{8}|G|$, so $\Pr(G) \le 5/8$. The reader may verify that this
bound is sharp, by examining the nonabelian groups of order eight.
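
The formula Pr(G) = k/n and the sharpness of the bound can be explored computationally. The Python sketch below is an illustration under stated assumptions: groups are represented as sets of permutation tuples generated from a few generators, and all function names are hypothetical rather than taken from any library.

```python
from fractions import Fraction
from itertools import product

def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Closure of the generators under multiplication (a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def commuting_probability(group):
    """|C| / n^2, counting commuting ordered pairs directly."""
    pairs = sum(1 for x, y in product(group, repeat=2)
                if compose(x, y) == compose(y, x))
    return Fraction(pairs, len(group) ** 2)

def class_count(group):
    """Number k of conjugacy classes."""
    classes = {frozenset(compose(compose(g, x), inverse(g)) for g in group)
               for x in group}
    return len(classes)

S3 = generate([(1, 0, 2), (1, 2, 0)])        # symmetric group on 3 letters
D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])  # dihedral group of order 8
```

For the dihedral group of order eight the direct count gives 5/8 with k = 5 classes, and for the symmetric group on three letters it gives 1/2 with k = 3, in agreement with Pr(G) = k/n.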


2. Compact groups. The reader may now wonder whether the above analysis
carries over in any sense to infinite groups. Of course, the ratio k/n is no longer
meaningful, but we shall see that there is an analogue of the bound 5/8 for a class
of topological groups.

Let G be a compact, Hausdorff topological group. We recall that G has a left
Haar measure; that is, a Borel measure $\mu$ such that $\mu(U) > 0$ for each nonempty
open set U of G, and $\mu(xE) = \mu(E)$ for each Borel set E of G and each $x \in G$.
Further, $\mu$ is unique once we impose the normalization condition $\mu(G) = 1$. The
reader who is not familiar with Haar measure may consult [5, Chapter XI]. On the
product space $G \times G$, we impose the product measure $\mu \times \mu$. Again let
$C = \{(x, y) \in G \times G \mid xy = yx\}$. We remark that $C = f^{-1}(1)$, where $f : G \times G \to G$
is the continuous function given by $f(x, y) = xyx^{-1}y^{-1}$. It follows that C is closed,
and hence measurable. We view $\mu \times \mu$ as a probability measure; then
$\Pr(G) = (\mu \times \mu)(C)$. Let us now prove our generalization of the last result of Section 1:


THEOREM. Let G be a compact nonabelian group. Then $\Pr(G) \le 5/8$.


Proof. Let $\chi : G \times G \to$ reals be the characteristic function of C. Then we have


$(\mu \times \mu)(C) = \int_{G \times G} \chi \, d(\mu \times \mu).$

By Fubini's theorem [5, p. 148], we have


$(\mu \times \mu)(C) = \int_G \int_G \chi(x, y)\, d\mu(y)\, d\mu(x).$


Also, $\int_G \chi(x, y)\, d\mu(y) = \mu(C_x)$ for each x, where again $C_x$ is the centralizer of x in G.
We recall once more that $[G : Z] \ge 4$. As G is the disjoint union of the cosets of Z,
it follows that $\mu(Z) \le 1/4$. (Note that Z is closed and hence measurable.) Now we
notice that if $x \in Z$, then $C_x = G$ and so $\mu(C_x) = 1$; on the other hand, if $x \in G - Z$,
then $C_x$ has index at least 2 in G, whence $\mu(C_x) \le 1/2$. Therefore we have


$\Pr(G) = (\mu \times \mu)(C) = \int_G \mu(C_x)\, d\mu(x)$

$\quad = \int_Z \mu(C_x)\, d\mu(x) + \int_{G - Z} \mu(C_x)\, d\mu(x)$

$\quad \le \mu(Z) \cdot 1 + \mu(G - Z) \cdot 1/2 = \mu(Z) + 1/2 - \mu(Z)/2 \le 5/8.$


3. Further remarks. Let us now return to the case of finite groups. Here the
formula Pr(G) = k/n may be used to good advantage in calculating bounds on Pr(G)
for special classes of groups. For example, the reader may use the class formula
to show that for nonabelian p-groups G, $\Pr(G) \le (p^2 + p - 1)/p^3$. Some information
may also be gathered from the theory of group characters [2]. One makes use of
the fact that the number of irreducible complex characters of G is just k, together
with the fact that $|G| = [G : G'] + n_1^2 + \cdots + n_s^2$, where G' is the commutator
subgroup of G, and $n_1, \ldots, n_s$ are the degrees of the nonlinear irreducible characters.

Here are some problems for the reader to try: 


(i) $\Pr(G \times H) = \Pr(G) \cdot \Pr(H)$.

(ii) If Pr(G)= 5/8, then G is nilpotent. 

(iii) If G is finite and Pr(G)= 5/8, then G is the direct product of an abelian group and a 2- 
group H such that |H| = 8, H is directly indecomposable and Pr(H)= 5/8. 

(iv) Characterize the groups H having the properties in (iii). (See Miller [6], where the groups 
with [G:Z]= 4 are classified.) 

(v) Derive the bound $\Pr(G) \le 5/8$ for finite groups by use of the facts from character theory
given above.

(vi) If G is simple and nonabelian, then $\Pr(G) \le 1/12$, with equality for the alternating group
on five letters. (This problem was first posed by J. Dixon.)

(vii) Study the probabilistic properties of finite groups in general. Some starting points might
be Erdős and Turán [4] and Dixon [3]. Of particular interest is a conjecture of Dixon: The
probability that two elements chosen at random from a finite simple group G generate G tends uniformly
to one as the order of G tends to infinity.


Finally, we would like to encourage the study of a more difficult problem: find
lower bounds for Pr(G). While it is easy to see that no universal lower bound exists,
Erdős and Turán [4] have shown that $\Pr(G) \ge (\log_2 \log_2 |G|)/|G|$. C. Ayoub [1]
has developed some lower bounds for p-groups of small order.


I am pleased to acknowledge useful conversations with M. Zorn, P. Halmos, and W. Moran. 
I am also grateful to R. MacKenzie who read the original manuscript. 


References 


1. C. Ayoub, On the number of conjugate classes in a group, Proc. Internat. Conf. Theory of 
Groups, Gordon and Breach, New York, 1967, pp. 7-10. 

2. C. Curtis and I. Reiner, Representation Theory of Finite Groups and Associative Algebras, 
Interscience, New York, 1962. 


3. J. Dixon, The probability of generating the symmetric group, Math. Z., 110 (1969) 199-205.

4. P. Erdős and P. Turán, On some problems of a statistical group-theory, IV, Acta Math. Acad.
Sci. Hung., 19 (1968) 413-435.

5. P. Halmos, Measure Theory, Van Nostrand, Princeton, N. J. 1950. 

6. G.A. Miller, Relative number of non-invariant operators in a group, Proc. Nat. Acad. Sci. 
USA, 30 (1944) 25-28. 

7. W. Scott, Group Theory, Prentice-Hall, Englewood Cliffs, N. J. 1964. 


REMARKS ON THE BESSEL POLYNOMIALS 
C. W. Barnes, University of Mississippi 


1. Introduction. The Bessel polynomial $y_n(x)$ is defined by Krall and Frink [4]
to be the polynomial of degree n, with constant term equal to unity, which satisfies
the differential equation


(1) $x^2y'' + 2(x + 1)y' - n(n + 1)y = 0.$


Krall and Frink discussed the Bessel polynomials from the standpoint of recurrence
relations, orthogonality, generating functions, and related matters. Their
algebraic properties were considered by Grosswald [2].

In the present note we establish a new result concerning the zeros of the Bessel
polynomials. Using a test of Wall [7], which the Bessel polynomials fit in a very
natural way, we prove that their zeros have negative real parts. We also give a new
proof of a theorem of Dickinson [1], section 6, that the origin is a limit point of
zeros of the Bessel polynomials. Our proof of Dickinson's theorem is somewhat
simpler than that given in [1] inasmuch as it depends mainly on an application of
the maximum modulus principle for analytic functions.

Finally we comment on the history of the Bessel polynomials, and relate them 
to work of Olds [5] based on Hermite [3]. 


2. The zeros of the Bessel polynomials. The differential equation (1) is satisfied
by


(2) $y_n(x) = \sum_{k=0}^{n} \dfrac{(n + k)!}{(n - k)!\,k!} \left(\dfrac{x}{2}\right)^k.$

These polynomials satisfy the recurrence relations

(3) $y_{n+1}(x) = (2n + 1)xy_n(x) + y_{n-1}(x),$


where $y_0(x) = 1$, $y_1(x) = 1 + x$.
Krall and Frink [4] showed that


(4) $x^2y_n'(x) = (nx - 1)y_n(x) + y_{n-1}(x)$

and

(5) $x^2y_{n-1}'(x) = y_n(x) - (nx + 1)y_{n-1}(x).$
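
Relations (1), (3) and (4) can be verified exactly for small n. The Python sketch below is an illustration, assuming the explicit formula (2) for the coefficients; polynomials are stored as coefficient lists in ascending powers and the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def bessel_poly(n):
    """Coefficients [c_0, ..., c_n] of y_n(x) from the explicit formula (2)."""
    return [Fraction(factorial(n + k), factorial(n - k) * factorial(k) * 2 ** k)
            for k in range(n + 1)]

def evaluate(coeffs, x):
    """Horner evaluation; an empty list is the zero polynomial."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def derivative(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

def ode_holds(n, x):
    """Equation (1): x^2 y'' + 2(x + 1) y' - n(n + 1) y = 0."""
    y = bessel_poly(n)
    dy = derivative(y)
    d2y = derivative(dy)
    return (x * x * evaluate(d2y, x) + 2 * (x + 1) * evaluate(dy, x)
            - n * (n + 1) * evaluate(y, x)) == 0

def recurrence_holds(n, x):
    """Relation (3): y_{n+1} = (2n + 1) x y_n + y_{n-1}, for n >= 1."""
    return evaluate(bessel_poly(n + 1), x) == (
        (2 * n + 1) * x * evaluate(bessel_poly(n), x)
        + evaluate(bessel_poly(n - 1), x))

def relation_4_holds(n, x):
    """Relation (4): x^2 y_n' = (n x - 1) y_n + y_{n-1}, for n >= 1."""
    y = bessel_poly(n)
    return x * x * evaluate(derivative(y), x) == (
        (n * x - 1) * evaluate(y, x) + evaluate(bessel_poly(n - 1), x))
```

For example, `bessel_poly(2)` gives the coefficients 1, 3, 3 of $y_2(x) = 1 + 3x + 3x^2$. Checking each relation at more rational points than its degree confirms it as a polynomial identity for that n.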
Next we require a 

LEMMA (Wall [7]). Let $P(x) = x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n$ be a polynomial
with real coefficients, and let $Q(x) = a_1x^{n-1} + a_3x^{n-3} + a_5x^{n-5} + \cdots$ be the
polynomial obtained from P(x) by dropping out the first, third, fifth, ... terms. Then
all zeros of P(x) have negative real parts if and only if the rational function
Q(x)/P(x) has a continued fraction expansion of the form


$\dfrac{Q(x)}{P(x)} = \cfrac{1}{c_1x + 1 + \cfrac{1}{c_2x + \cfrac{1}{c_3x + \cdots + \cfrac{1}{c_nx}}}}$


where the coefficients $c_1, c_2, \cdots, c_n$ are all positive.


Hence to test a polynomial P(x), we apply the Euclidean algorithm for the
greatest common divisor to the polynomials P(x) and Q(x). In the event the sequence
of quotients has the form $c_1x + 1, c_2x, \cdots, c_nx$, where each of $c_1, c_2, \cdots, c_n$ is positive,
then the zeros of P(x) have negative real parts.


THEOREM 1. The zeros of the Bessel polynomials have negative real parts. 
Proof. From (3) we obtain 
(6) $y_{n+1}(-x) = (2n + 1)(-x)y_n(-x) + y_{n-1}(-x).$


Let $Q_n(x)$ be the polynomial of degree n - 1 obtained from $y_n(x)$ by dropping out
the first, third, fifth, ... terms. Then


(7) $Q_n(x) = \tfrac{1}{2}\{y_n(x) + (-1)^{n+1}y_n(-x)\}.$

We can show that

(8) $Q_{n+1}(x) = (2n + 1)xQ_n(x) + Q_{n-1}(x).$


For we note that $Q_1(x) = 1$, $Q_2(x) = 3x$, and (8) holds when n = 2. We make
the inductive hypothesis that (8) holds for all integers m, $2 \le m < n$. Now consider


$(2n + 1)xQ_n(x) + Q_{n-1}(x)$

$= (2n + 1)x\{\tfrac{1}{2}(y_n(x) + (-1)^{n+1}y_n(-x))\} + \tfrac{1}{2}\{y_{n-1}(x) + (-1)^n y_{n-1}(-x)\}$

$= \tfrac{1}{2}\{(2n + 1)xy_n(x) + y_{n-1}(x)\} + \tfrac{1}{2}\{(2n + 1)x(-1)^{n+1}y_n(-x) + (-1)^n y_{n-1}(-x)\}$

$= \tfrac{1}{2}y_{n+1}(x) + \tfrac{1}{2}\{(2n + 1)x(-1)(-1)^n y_n(-x) + (-1)^n y_{n-1}(-x)\}$

$= \tfrac{1}{2}y_{n+1}(x) + \tfrac{1}{2}(-1)^n\{(2n + 1)(-x)y_n(-x) + y_{n-1}(-x)\}$

$= \tfrac{1}{2}\{y_{n+1}(x) + (-1)^{n+2}y_{n+1}(-x)\} = Q_{n+1}(x),$


where we used (6) and (7).
Next we show that 

(9) $\frac{y_n(x)}{Q_n(x)} = 1 + x + \cfrac{1}{3x + \cfrac{1}{5x + \cdots + \cfrac{1}{(2n-1)x}}}.$

We use induction and the recurrence relation (8). Since $y_1(x)/Q_1(x) = 1 + x$, we 
do indeed have a basis for induction. Thus we suppose that (9) is valid for all integers 
2, 3, ..., n, and consider the continued fraction 

$1 + x + \cfrac{1}{3x + \cfrac{1}{5x + \cdots + \cfrac{1}{(2n+1)x}}}.$

Suppose that $R(x)$ and $S(x)$ denote the numerator and denominator of the continued 
fraction. Then by the standard recurrence relations for the numerators and 
denominators of the approximants to a continued fraction we have 

$R(x) = (2n+1)x\,y_n(x) + y_{n-1}(x),$

and 

$S(x) = (2n+1)x\,Q_n(x) + Q_{n-1}(x),$

since by the inductive hypothesis, $y_n(x)$ is the numerator and $Q_n(x)$ is the denominator 
of the continued fraction 

$1 + x + \cfrac{1}{3x + \cdots + \cfrac{1}{(2n-1)x}},$

and $y_{n-1}(x)$ is the numerator and $Q_{n-1}(x)$ is the denominator of the last approximant 
to 

$1 + x + \cfrac{1}{3x + \cdots + \cfrac{1}{(2n-3)x}}.$

Hence from (3) and (8) it follows that 

$R(x) = y_{n+1}(x)$ and $S(x) = Q_{n+1}(x).$

Thus the expansion (9) is valid for every positive integer n. Theorem 1 now follows 
directly from the lemma. 
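
The expansion (9) can be spot-checked at rational points. The sketch below evaluates $y_n(x)$ and $Q_n(x)$ by the recurrences (3) and (8) and compares their ratio with the finite continued fraction evaluated from the bottom up. The seed value $Q_0 = 0$ is an assumption made here for convenience (it reproduces $Q_2 = 3x$ from $Q_1 = 1$); the function names are illustrative.

```python
from fractions import Fraction

def y_and_q(n, x):
    """Return (y_n(x), Q_n(x)) for n >= 1 via recurrences (3) and (8)."""
    y_prev, y_cur = Fraction(1), 1 + x          # y_0, y_1
    q_prev, q_cur = Fraction(0), Fraction(1)    # Q_0 := 0 (assumed seed), Q_1 = 1
    for m in range(1, n):
        y_prev, y_cur = y_cur, (2 * m + 1) * x * y_cur + y_prev
        q_prev, q_cur = q_cur, (2 * m + 1) * x * q_cur + q_prev
    return y_cur, q_cur

def continued_fraction(n, x):
    """Evaluate 1 + x + 1/(3x + 1/(5x + ... + 1/((2n-1)x))) for n >= 1."""
    if n == 1:
        return 1 + x
    tail = (2 * n - 1) * x
    for k in range(n - 1, 1, -1):
        tail = (2 * k - 1) * x + 1 / tail
    return 1 + x + 1 / tail
```

With `x = Fraction(1, 2)` and `n = 3`, both sides equal 77/38. Positive rational `x` is assumed so that no partial denominator vanishes.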

Grosswald [2] proved that the zeros of the Bessel polynomials are simple; all 
zeros are inside the unit circle except for the zero of $y_1(x)$ which is on the unit circle. 
Theorem 1 improves this last result. It is also established in [2] that for n even, 
$y_n(x)$ has no real zeros and that for n odd, $y_n(x)$ has a single real, negative zero; 
finally, if $x_n$ is the real zero of $y_n(x)$ for odd n, we have 

$-1 = x_1 < x_3 < \cdots < x_n < x_{n+2} < \cdots < 0.$

We can now show that there is an $x_n$ arbitrarily near the origin. Thus we have 
(see [1], page 954): 


THEOREM 2. The origin is a limit point of the zeros of the Bessel polynomials. 


Proof. Suppose there is a positive number c < 1 such that the circle $|z| = c$ 
contains no zero of the Bessel polynomials. By a classic theorem of Cauchy we have 

$\frac{1}{2\pi i}\int_{|z|=c} \frac{y_n'(z)}{y_n(z)}\,dz = 0$ for n = 1, 2, 3, .... 

Hence by (4) we have 

$\frac{1}{2\pi i}\int_{|z|=c} \left(\frac{nz-1}{z^{2}} + \frac{y_{n-1}(z)}{z^{2}\,y_n(z)}\right) dz = 0.$

Thus 

$\frac{1}{2\pi i}\int_{|z|=c} \frac{y_{n-1}(z)}{z^{2}\,y_n(z)}\,dz = -\frac{1}{2\pi i}\int_{|z|=c} \frac{nz-1}{z^{2}}\,dz = -n.$

Therefore 

$n = -\frac{1}{2\pi i}\int_{|z|=c} \frac{y_{n-1}(z)}{z^{2}\,y_n(z)}\,dz,$

and, by the fundamental estimate on an integral, we have 

$n \le \frac{1}{c}\max_{|z|=c}\left|\frac{y_{n-1}(z)}{y_n(z)}\right|,$

or 

(10) $c \le \frac{1}{n}\max_{|z|=c}\left|\frac{y_{n-1}(z)}{y_n(z)}\right|.$

Using (5) we have $y_{n-1}'(z)/y_{n-1}(z) = y_n(z)/(z^{2} y_{n-1}(z)) - (nz+1)/z^{2}$. Therefore, 
since we shall have 

$\frac{1}{2\pi i}\int_{|z|=c} \frac{y_{n-1}'(z)}{y_{n-1}(z)}\,dz = 0,$

it follows that 

$\frac{1}{2\pi i}\int_{|z|=c} \frac{y_n(z)}{z^{2}\,y_{n-1}(z)}\,dz = \frac{1}{2\pi i}\int_{|z|=c} \frac{nz+1}{z^{2}}\,dz = n.$

As before, we obtain $n \le \frac{1}{c}\max_{|z|=c}|y_n(z)/y_{n-1}(z)|$, or 

(11) $c \le \frac{1}{n}\max_{|z|=c}\left|\frac{y_n(z)}{y_{n-1}(z)}\right|.$

Considering the estimates (10) and (11), it now follows by the principle of the 
maximum and minimum modulus that 

$c \le \frac{1}{n}\left|\frac{y_{n-1}(z)}{y_n(z)}\right|$

for each point z such that $|z| = c$. In particular, when z = c we have 

$c \le \frac{1}{n}\left|\frac{y_{n-1}(c)}{y_n(c)}\right| < \frac{1}{n},$

since for c > 0, $y_{n-1}(c) < y_n(c)$. Therefore c < 1/n for every positive integer n. 
This implies c = 0; hence there can be no circle about the origin which is free of 
zeros of $y_n(x)$, and Theorem 2 is established. 




3. Conclusion. Grosswald [2] comments in section 2 on the property of a Bessel 
polynomial to approximate an exponential and remarks earlier that the polynomials 
$A_n(x)$ defined by 

$(-1)^{n+1} A_n(x) = (1 - D)^{-n-1} x^{n} = x^{n} y_n(2/x),$

where D is the symbol of derivation, have been used in the proofs of the transcendence 
of e. These polynomials $A_n(x)$ are discussed in Siegel [6]. 

In what follows we can be more specific about the connection between an exponential 
and the Bessel polynomials. This relation also puts into evidence the 
polynomials $Q_n(x)$ of Wall’s test fraction [7]. 

Olds [5] gave a development of the simple continued fraction for e based on 
ideas in Hermite’s paper [3] in which the transcendence of e was established. Olds 
obtained two sequences of polynomials $\{T_n(x)\}$ and $\{Z_n(x)\}$ such that 

$T_n(x) = (2n-1)x\,T_{n-1}(x) + T_{n-2}(x),$
$Z_n(x) = (2n-1)x\,Z_{n-1}(x) + Z_{n-2}(x).$

In particular, 

$T_0(x) = 1,\ T_1(x) = x,\ T_2(x) = 3x^{2} + 1,\ T_3(x) = 15x^{3} + 6x,$
$Z_0(x) = 0,\ Z_1(x) = 1,\ Z_2(x) = 3x,\ Z_3(x) = 15x^{2} + 1.$

Hence for n = 1, 2, 3, ... it is immediate that $Z_n(x) = Q_n(x)$ in the notation of section 2, 
and $T_n(x) + Z_n(x) = y_n(x)$, the Bessel polynomial of degree n. 

As a consequence of the fundamental recurrence relations for the numerators 
and denominators of the approximants of a continued fraction, it follows that the 
rational functions $T_n(x)/Z_n(x)$, n = 1, 2, 3, ... are the approximants to the simple 
continued fraction 

$x + \cfrac{1}{3x + \cfrac{1}{5x + \cdots}}.$

Olds verified that 

$\lim_{n\to\infty}\left|\frac{e^{2/x} + 1}{e^{2/x} - 1} - \frac{T_n(x)}{Z_n(x)}\right| = 0,$

and hence that 

$\frac{e^{2/x} + 1}{e^{2/x} - 1} = x + \cfrac{1}{3x + \cfrac{1}{5x + \cdots}}.$

Thus by (9) we have 


THEOREM 3. For n = 1, 2, 3, ... the sequence of rational functions $\{y_n(x)/Q_n(x)\}$, 
$x \ne 0$, is the sequence of approximants to the continued fraction 

$1 + x + \cfrac{1}{3x + \cfrac{1}{5x + \cdots + \cfrac{1}{(2n-1)x + \cdots}}},$

that is, to the continued fraction for $2e^{2/x}/(e^{2/x} - 1)$. 


By the classic theory concerning the approximants to a continued fraction, it 
is now evident why the Bessel polynomials provide good approximations to an 
exponential. 
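
A quick numerical illustration of Theorem 3: the sketch below computes $y_n(x)/Q_n(x)$ exactly from the two three-term recurrences and compares it with $2e^{2/x}/(e^{2/x} - 1)$ in floating point. The seed $Q_0 = 0$ and the function names are assumptions made for this illustration.

```python
import math
from fractions import Fraction

def approximant(n, x):
    """Exact value of y_n(x)/Q_n(x) for n >= 1 and rational x != 0."""
    y_prev, y_cur = Fraction(1), 1 + x          # y_0, y_1
    q_prev, q_cur = Fraction(0), Fraction(1)    # Q_0 := 0 (assumed seed), Q_1 = 1
    for m in range(1, n):
        y_prev, y_cur = y_cur, (2 * m + 1) * x * y_cur + y_prev
        q_prev, q_cur = q_cur, (2 * m + 1) * x * q_cur + q_prev
    return y_cur / q_cur

def limit_value(x):
    """The function 2 e^{2/x} / (e^{2/x} - 1) of Theorem 3, in floating point."""
    e = math.exp(2.0 / x)
    return 2.0 * e / (e - 1.0)
```

At x = 1 the approximants settle very quickly: `float(approximant(10, Fraction(1)))` and `limit_value(1.0)` agree to well within ordinary double precision rounding.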


References 


1. David Dickinson, On Lommel and Bessel Polynomials, Proc. Amer. Math. Soc., 5 (1954) 
946-956. 

2. Emil Grosswald, On Some Algebraic Properties of the Bessel Polynomials, Trans. Amer. 
Math. Soc., 71 (1951) 197-210. 

3. C. Hermite, Compt. Rend. Acad. Sci. Paris, 77 (1873) 18-24, 74-79, 285-293. 

4. H.L. Krall and Orrin Frink, A New Class of Orthogonal Polynomials: the Bessel Polynomials, 
Trans. Amer. Math. Soc., 65 (1949) 100-115. 

5. C. D. Olds, The simple continued fraction expansion of e, this MONTHLY, 77 (1970) 968-974. 

6. Carl Ludwig Siegel, Transcendental Numbers, Annals of Mathematics Studies, Number 
16. Princeton University Press, 1949. 


7. H. S. Wall, Polynomials whose zeros have negative real parts, this MONTHLY, 52 (1945) 308-322. 


A MICRONOTE ON A FUNCTIONAL EQUATION 
H. N. SHAPIRO, Courant Institute 


1. Introduction. It is well known that the only locally integrable (i.e., integrable 
over every finite interval) solution of the functional equation 


(1.1) $f(x + y) = f(x) + f(y)$


is of the form $f(x) = cx$, c a constant. In this note we propose to give a very short 
proof of this. 


2. The proof. On the basis of (1.1) and the hypothesis of local integrability 
one easily verifies the identity 


(2.1) $y f(x) = \int_0^{x+y} f(u)\,du - \int_0^{x} f(u)\,du - \int_0^{y} f(u)\,du.$


Since the right side of (2.1) is invariant under the interchange of x and y, it follows 
that $x f(y) = y f(x)$. Thus for $x \ne 0$, $f(x)x^{-1} = c$, a constant, or $f(x) = cx$. Since 
(1.1) implies $f(0) = 0$ this also holds for x = 0. 


3. Concluding remarks. It is known that the above result remains valid under 
the weaker hypothesis that f(x) is measurable [1]. Note also that (2.1) asserts that 
yf(x) is a coboundary. 


Reference 


1. W. Sierpinski, Sur l’équation fonctionnelle f(x + y) = f(x) + f(y), Fundamenta Math., 
1 (1920) 116-122. 


AN ADDENDUM TO THE PAPER 
“A CHARACTERIZATION OF THE n × n MATRICES 
OVER A FINITE FIELD” 


J. V. BRAWLEY, Clemson University and L. CARLITZ, Duke University 


In [1], the authors showed that (except for two trivial cases) a ring R has the 
property that every function f from R to R can be represented by a generalized poly- 
nomial [see 1] if and only if for some n and some finite F, R is the ring of n x n matrices 
over F. In the present note we obtain an extension of that result. It will be seen that 
the ideas used here are only slight generalizations of those used in [1]. 


DEFINITION. Let R be a ring. A polynomial in m variables $x_1, x_2, \ldots, x_m$ over R 
is a finite sum of terms of the form 

$A_0^{\varepsilon}\, x^{n_1} A_1\, y^{n_2} A_2 \cdots A_{k-1}\, z^{n_k} A_k^{\varepsilon'},$

where $x, y, \ldots, z$ can be any of $x_1, \ldots, x_m$, where $\varepsilon$ and $\varepsilon'$ are in {0, 1}, and where 
$k \ge 0$ and the $n_i \ge 1$ are integers. 

It is clear that any such polynomial $f(x_1, \ldots, x_m)$ defines via substitution a function 
f from $R^{m}$ to R, in which case the polynomial is said to represent the function f. 


THEOREM. Let $R \ne (0)$ be a ring and let $m \ge 2$ be an integer. Every function 
from $R^{m}$ to R is representable by a polynomial in m variables iff $R = (F)_n$ for 
some n and some finite field F. 


Proof. Let P denote all functions from $R^{m}$ to R which are representable by polynomials, 
and assume $P = R^{R^{m}}$. R is finite by an argument similar to Lemma 1 
of [1]. Also, R has only trivial ideals. To see this, assume $0 \subset I \subset R$ is proper. 
Select $a \in I - \{0\}$ and $b \in R - I$, let $p : R^{m} \to R$ be the function 

$p(r_1, \ldots, r_m) = \begin{cases} b; & r_1 = a,\ r_2 = r_3 = \cdots = r_m = 0 \\ 0; & \text{otherwise,} \end{cases}$

and let $p(x_1, \ldots, x_m)$ be the polynomial which represents p. Also, let $\bar p(x_1, \ldots, x_m)$ 
be the polynomial over R/I obtained by replacing any coefficient a in $p(x_1, \ldots, x_m)$ 
by $\bar a = a + I$. Then $\bar p$ defines a map from $(R/I)^{m}$ to $(R/I)$ and moreover, 

$\bar p(\bar r_1, \bar r_2, \ldots, \bar r_m) = \overline{p(r_1, r_2, \ldots, r_m)}.$

Thus, $\bar p(\bar a, \bar 0, \bar 0, \ldots, \bar 0) = \bar p(\bar 0, \bar 0, \ldots, \bar 0)$ implies $\overline{p(a, 0, \ldots, 0)} = \overline{p(0, 0, \ldots, 0)}$ or $\bar b = \bar 0$, a 
contradiction. Thus, R is a finite simple ring or else $R^{2} = (0)$. To show the latter 
situation is impossible, assume $R^{2} = (0)$. Then (R, +) is a finite simple group and 
thus has prime order p. Since $R^{2} = (0)$, only polynomials of the form 

$\varepsilon_0 a_0 + \varepsilon_1 x_1 + \cdots + \varepsilon_m x_m,$

where $\varepsilon_i \in \{0, 1, \ldots, p-1\}$, need be considered as representing functions in $R^{R^{m}}$. 
Hence 

$|P| = p^{m+1} = |R^{R^{m}}| = p^{p^{m}},$

which is only valid when p = 2 and m = 1, a contradiction to the fact that $m \ge 2$. 
Thus, R is simple so that $R = (F)_n$ for some n and some finite field F. 

Conversely assume $R = (F)_n$, and let $f : R^{m} \to R$. By Theorem 1 of [1], there 
exists a polynomial in one variable p(x) which represents the function 

$p(r) = \begin{cases} 1; & r = 0 \\ 0; & r \ne 0. \end{cases}$

The m-variable polynomial function 

$\sum_{r_1 \in R} \cdots \sum_{r_m \in R} f(r_1, \ldots, r_m)\, p(x_1 - r_1)\, p(x_2 - r_2) \cdots p(x_m - r_m)$

is easily seen to represent f. 
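
The interpolation sum above can be tried out in the simplest special case, which is an assumption of this sketch and not the general setting of the theorem: R is the prime field of 5 elements (1 × 1 matrices), where the indicator of 0 is represented by $1 - x^{p-1}$ by Fermat's little theorem. The names `delta` and `interpolate` are illustrative.

```python
from itertools import product

P = 5  # hypothetical choice of prime; R = integers modulo P

def delta(r, p=P):
    """Value at r of the polynomial 1 - x^(p-1): 1 if r = 0 (mod p), else 0."""
    return (1 - pow(r, p - 1, p)) % p

def interpolate(f, m, p=P):
    """Build the function computed by
    sum over (r_1..r_m) of f(r_1..r_m) * delta(x_1 - r_1) * ... * delta(x_m - r_m)."""
    table = {rs: f(*rs) % p for rs in product(range(p), repeat=m)}

    def g(*xs):
        total = 0
        for rs, value in table.items():
            term = value
            for x, r in zip(xs, rs):
                term = term * delta((x - r) % p, p) % p
            total = (total + term) % p
        return total

    return g
```

For any function `f` of two arguments with values modulo 5, `interpolate(f, 2)` agrees with `f` at all 25 points, since exactly one term of the sum survives at each point.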


Reference 


1. J. V. Brawley and L. Carlitz, A characterization of the n × n matrices over a finite field, this 
MONTHLY, 80 (1973) 670-672. 


RESEARCH PROBLEMS 


EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and 
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


EXPLORING A PLANET 
L. FEJES TÓTH, Hungarian Academy of Sciences, Budapest 


The surface of a planet can be explored by setting up on the planet a certain 
number n of bases, serving as starting-points for various operations. The problem 
of the most economical distribution of the bases may be formulated as follows: 
How should n bases (points) be distributed on a sphere so as to minimize the greatest 
distance between a point of the sphere and the base nearest to it? The only values 
of n for which the solution of this problem is known are n = 2,3,4,5,6,7, 10, 12 
and 14 [1,2,3]; for arbitrary values of n there is not even a reasonable conjecture 
concerning the solution. 

Another possibility to explore the planet is to make a set of photographs from 
a certain number n of satellites guided around the planet. Supposing that the planet 
hovers in space without rotation and the satellites orbit the planet at the same con- 
stant altitude, the problem of the most economical choice of the paths of the satellites 
is as follows: How should n great circles be distributed on a sphere so as to minimize 
the greatest distance between a point of the sphere and the great circle nearest to it? 

In contrast to the previous problem, there is a chance to solve this problem 
in its full generality. Indeed, it may be conjectured that the solution is given by n 
great circles all passing through two antipodal points so as to divide the sphere 
into 2n equal digons. This is trivial for n = 2 and it has been proved by Miss V. Rosta 
[4] for n = 3. 
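
For the conjectured arrangement the quantity being minimized can be estimated numerically. The sketch below places n great circles through the poles at equal angles and samples the sphere on a latitude-longitude grid; the distance from a unit vector p to the great circle with unit normal ν is taken as |arcsin(p·ν)|. The grid is chosen (as an assumption of this sketch) to contain the midpoints of the digons on the equator, where the value π/(2n) is attained. Function names are illustrative.

```python
import math

def meridian_normals(n):
    """Unit normals of n great circles through the poles, spaced by pi/n."""
    return [(-math.sin(k * math.pi / n), math.cos(k * math.pi / n), 0.0)
            for k in range(n)]

def distance_to_nearest_circle(point, normals):
    """Angular distance from a unit vector to the nearest of the great circles."""
    best = math.pi
    for nu in normals:
        s = sum(p * v for p, v in zip(point, nu))
        s = max(-1.0, min(1.0, s))      # guard against rounding outside [-1, 1]
        best = min(best, abs(math.asin(s)))
    return best

def sampled_covering_radius(n, steps=24):
    """Largest sampled distance to the nearest circle; steps should be even."""
    normals = meridian_normals(n)
    worst = 0.0
    for i in range(steps + 1):
        lat = -math.pi / 2 + i * math.pi / steps
        for j in range(2 * n * steps):
            lon = j * math.pi / (n * steps)
            point = (math.cos(lat) * math.cos(lon),
                     math.cos(lat) * math.sin(lon),
                     math.sin(lat))
            worst = max(worst, distance_to_nearest_circle(point, normals))
    return worst
```

For this arrangement the sampled value matches π/(2n), consistent with zones of width π/n; the sketch says nothing about whether other arrangements can do better, which is the open question.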

The problem is equivalent to the following dual counterpart: How should n 
points be distributed on a sphere so as to minimize the greatest distance between 
a great circle and the point nearest to it? The dual counterpart of the above conjecture 
says that the points together with their antipodes must be the vertices of a regular 
2n-gon. This problem may be interpreted as the problem of the most efficient 
distribution of n observation posts set up by the inhabitants of a planet for the purpose 
of detecting any satellite guided around the planet by intelligent aliens. 

The problem may be rephrased also as follows: Prove or disprove the conjecture 
that if n equal zones cover the sphere then their width is at least π/n. Here a zone 
of width w is defined as the parallel domain of a great circle of distance w/2. 

In this formulation we are at once led to the following generalization: Prove 
that the total width of any set of zones covering the sphere is at least π. 

The problem can be further generalized. Instead of the whole sphere we can con- 
sider a convex spherical domain, i.e., a domain any two points of which can be 
joined by an arc of a great circle lying in the domain. Then we can ask: Is it true 
that if a convex spherical domain is covered with a set of zones of total width w 
then it can be covered with one zone of width w? This problem is a spherical analogue 
of the well-known ‘‘plank problem’’ of Tarski, first solved by Bang [5,6] (see also 
[7, 8, 9, 10, 11, 12, 13]). 

Another spherical analogue of the plank problem, unsolved as yet, was proposed 
by the author ([14], p. 156) more than twenty years ago: Is it true that if a convex 
spherical domain is covered with a set of digons of total area T then it can be covered 
with one digon of area T? 


References 


1. L. Fejes Tóth, Über die Bedeckung einer Kugelfläche durch kongruente Kugelkalotten, 
Mat. Fiz. Lapok, 50 (1943) 40-46. (Hungarian with German abstract.) 

2. K. Schütte, Überdeckung der Kugel mit höchstens acht Kreisen, Math. Ann., 129 (1955) 
181-186. 

3. G. Fejes Tóth, Kreisüberdeckungen der Sphäre, Studia Sci. Math. Hung., 4 (1969) 225-247. 

4. V. Rosta, An extremal arrangement of three great circles on a sphere, Mat. Lapok, 24 (1973) 
(to appear). (Hungarian with English abstract.) 

5. Th. Bang, On covering by parallel-strips, Mat. Tidsskr. B, (1950) 49-53. 

6. Th. Bang, A solution of the “plank problem,” Proc. Amer. Math. Soc., 2 (1951) 990-993. 

7. W. Fenchel, On Th. Bang’s solution of the plank problem, Mat. Tidsskr. B, (1951) 49-51. 

8. H. G. Eggleston, On triangles circumscribing plane convex sets, J. London Math. Soc., 28 
(1953) 36-46. 

9. D. Ohmann, Eine Abschätzung für die Dicke bei Überdeckung durch konvexe Körper, J. 
reine angew. Math., 190 (1952) 125-128. 

10. D. Ohmann, Kurzer Beweis einer Abschätzung für die Breite bei Überdeckung durch konvexe 
Körper, Arch. Math., 8 (1957) 150-152. 

11. Th. Bang, Some remarks on the union of convex bodies, Tolfte Skand. Mat.-Kong. Lund, 
1953 (1954) 5-11. 

12. T.-Y. Lee, J.-S. Lin, K.-C. Tong, M.-Y. Zhang, A solution of Bang’s “Plank problem,” 
J. Chinese Math. Soc., 2 (1953) 139-143. (Chinese with English abstract.) 

13. N. Bognar, On W. Fenchel’s solution of the plank problem, Acta Math. Acad. Sci. Hung., 
12 (1961) 269-270. 

14. L. Fejes Tóth, Lagerungen in der Ebene, auf der Kugel und im Raum, Zweite Auflage, 
Springer-Verlag, Berlin-Heidelberg-New York, 1972. 


WHAT ARE THE LATIN SQUARE GROUPS ? 


J. J. CARROLL, G. A. FISHER, Bell Laboratories, Indian Hill, Illinois, and 


A. M. ODLYZKO, N. J. A. SLOANE, Bell Laboratories, Murray Hill, New Jersey 


1. The problem. The following question has arisen in connection with the 
diagnosis of faults in sequential machines [1]. Let G be a permutation group acting 
transitively on the symbols {1, ..., n}. When is it possible to find n elements $g_1 = e$ 
(the identity of G), $g_2, \ldots, g_n$ such that for each i the symbols $g_1(i), \ldots, g_n(i)$ are 
distinct? Such a sequence, when it exists, is called a driving sequence. 

Given any sequence $g_1, \ldots, g_n$ of elements of G, consider the square array of 
size n in which the (k, l)-th entry is $g_k(l)$. It is easily seen that $g_1, \ldots, g_n$ is a driving 
sequence if and only if this is a Latin square. 

Conversely, given a Latin square of order n in which the first row is normalized 
to be 1, 2, ..., n, we can construct a set of n permutations $g_1 = e, g_2, \ldots, g_n$ acting 
on {1, ..., n} which are defined by $g_k(l)$ = (k, l)-th entry of the Latin square. Let 
us call a group which can be generated by such a set of permutations a Latin square 
group, or simply Latin. 

As an example, the Latin square 


1 2 3 4 
2 1 4 3 
3 4 1 2 
4 3 2 1 


gives the permutations e, (12)(34), (13)(24) and (14)(23), which generate (in this 
case are actually equal to) the Klein 4-group. 
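
The passage from a Latin square to a group can be sketched as follows. Rows are read as permutations written as tuples of images, the driving-sequence condition is checked column by column, and the generated group is found by closing the identity under multiplication by the rows. All function names are illustrative.

```python
def is_latin(square):
    """Every row and every column is a permutation of 1..n."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[k][l] for k in range(n)} == symbols for l in range(n))
    return rows_ok and cols_ok

def is_driving_sequence(perms):
    """For each symbol i the images g_1(i), ..., g_n(i) are distinct."""
    n = len(perms)
    return all(len({g[i] for g in perms}) == n for i in range(n))

def compose(g, h):
    """The permutation i -> g(h(i)), with permutations stored as tuples of images."""
    return tuple(g[h[i] - 1] for i in range(len(g)))

def generate(gens):
    """Subgroup generated by gens: closure of the identity under left multiplication."""
    n = len(gens[0])
    identity = tuple(range(1, n + 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

klein_square = [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]]
klein_rows = [tuple(row) for row in klein_square]
```

For the square of the example, `generate(klein_rows)` returns exactly the four rows, in agreement with the remark that they form the Klein 4-group.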

We now see that a permutation group G contains a driving sequence if and only 
if it contains a Latin square subgroup. The initial problem becomes: which groups 
contain a Latin square subgroup? We also ask: what are the Latin square groups? 
Both of these questions are open. 


2. Known results. A regular group is a transitive group with the property 
that no element, apart from the identity, fixes any symbol. Such a group is a regular 
(or Cayley) representation of a group of order n, and conversely every regular 
representation of a group of order n gives a regular group. 


THEOREM. A regular group is a Latin square group. 


Proof. A regular group G must contain exactly n − 1 permutations with no fixed 
point. Let $g_1 = e, g_2, \ldots, g_n$ be a list of the elements of G. Then $g_1, \ldots, g_n$ is a driving 
sequence. 




As corollaries, we find that any Frobenius group contains a Latin square sub- 
group, and any abelian transitive group is Latin. 

It can also be shown that for all m, the alternating group on m symbols contains 
a Latin square subgroup, and for all $m \ne 3, 4$ the symmetric group on m symbols 
is Latin. 

Two permutation groups acting on n symbols are regarded as different permu- 
tation groups if and only if no relabelling of the symbols transforms one into the 
other. 

By examining all the different permutation groups on ≤ 7 symbols ([2], [3], 
[5], [6] for n ≤ 6, [8] for n = 7), and by generating all the Latin square groups 
on ≤ 6 symbols (using the list in [7]) we observed the following. Of the 37 transitive 
groups on ≤ 7 symbols, all but three contain Latin subgroups. Those 3 all act 
on 6 symbols and have orders 12, 24 and 60. Of the 30 transitive groups on ≤ 6 
symbols, exactly 15 are Latin. In addition (using [8]) all primitive groups on ≤ 9 
symbols, with the exception of the group of order 60 on 6 symbols just mentioned, 
contain Latin subgroups. 

There are 9408 reduced Latin squares of order 6. (A reduced Latin square has 
its first row and column in lexicographic order.) 7776 of these generate the symmetric 
group. This suggests the conjecture that the probability of the rows of a Latin square 
of order n generating the symmetric group approaches 1 as n → ∞. This is supported 
by Dixon’s theorem [4] that two randomly chosen permutations on n symbols 
generate the symmetric group with probability approaching 3/4 as n → ∞. 


References 


1. J. J. Carroll, Examination of sequential circuits: A model and a method, Ph.D. Thesis, Dept. 
of Elect. Engin., Illinois Inst. Technology, Chicago, Illinois, May, 1972. 

2. A. Cayley, On the substitution groups for two, three, four, five, six, seven, and eight letters, 
Quart. J. Math., 25 (1890-1891) 71-88, 137-155. 

3. F. N. Cole, Note on the substitution groups of six, seven, and eight letters, Bull. New York 
Math. Soc., 2 (1893) 184-190. 

4. J. D. Dixon, The probability of generating the symmetric group, Math. Z., 110 (1969) 199-205. 

5. G. A. Miller, Memoir on the substitution-groups whose degree does not exceed eight, 
Amer. J. Math., 21 (1899) 288-337. 

6. G. A. Miller, Historical note on the determination of all the permutation groups of low degrees, 
Collected Works, Vol. I, Univ. of Illinois Press, Urbana, Illinois, 1935, pp. 1-9. 

7. C.R. Rao, S. K. Mitra, and A. Matthai, Formulas and Tables for Statistical Work, Statistical 
Publishing Society, Calcutta, 1966, p. 193. 

8. C. C. Sims, Computational methods in the study of permutation groups, pp. 169-183 of 
J. Leech, editor, Computational Problems in Abstract Algebra, Pergamon Press, Oxford, 1969. 


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70803. 


GEOMETRIC FIT OF A MONOTONIC CUBIC 
W. P. Cooke, University of Wyoming 


1. Introduction. The problem considered herein arose during a discussion with 
G. L. Haynes [4] about the use of a cubic to represent the distribution function for a 
certain random variable. This paper, however, deals only with the computational 
problem of the least squares fit of a monotonic cubic constrained to pass through 
the origin and the point (1,1). It should be noted that this problem, as well as the 
more difficult one where the only constraint is monotonicity, may be solved by the 
method of convex programming shown in Hartley, Hocking, and Cooke [3]. The 
paper is, in fact, a computational illustration of a well-known result in constrained 
optimization. After observing the feasible region in Figure 1, it will be clear that even 
an ordinary Lagrange-multiplier approach will solve the problem. 

The author feels that there are many characteristics of the ‘“‘geometric fit’’ 
exhibited herein that would make this problem a good classroom example for a 
course in numerical analysis, statistics, or operations research. Particularly it would 
serve as a graphic introduction to some of the important concepts in nonlinear 
programming. 


2. Formulation of the problem. It is desired to estimate the parameters $a_i$, 
i = 0, 1, 2, 3, in the function 

(2.1) $y = a_0 + a_1 x + a_2 x^{2} + a_3 x^{3},$

subject to the conditions that its graph pass through (0,0) and (1,1) and that $y' \ge 0$ 
for $0 \le x \le 1$. Data-points (x, y) will be observed for 0 < x < 1. 

It is clear that $a_0 = 0$ and $a_1 + a_2 + a_3 = 1$. Thus the experimental data is required 
to estimate only two parameters subject to $y' \ge 0$. The monotonicity condition may 
be written in the form 

(2.2) $(1, x)\begin{pmatrix} a_1 & a_2 \\ a_2 & 3a_3 \end{pmatrix}\begin{pmatrix} 1 \\ x \end{pmatrix} \ge 0$ for $0 \le x \le 1$. 

Using the fact that the matrix of the quadratic form in (2.2) must be positive 
semi-definite along with $a_1 + a_2 + a_3 = 1$, some simple manipulation allows the 
following formulation: 
Estimate $a_1$ and $a_2$ in the expression 

(2.3) $u = a_1 z_1 + a_2 z_2,$




subject to the restrictions 

(2.4) $a_1 \ge 0, \qquad a_1 + a_2 \le 1, \qquad 3a_1^{2} + a_2^{2} + 3a_1 a_2 - 3a_1 \le 0,$

where $u = y - x^{3}$, $z_1 = x - x^{3}$, and $z_2 = x^{2} - x^{3}$. Note that $z_1 \ge 0$ and $z_2 \ge 0$ 
since $0 \le x \le 1$. 

Inequalities (2.4) specify the convex feasible region S in which the desired 
estimate $\alpha^{*\prime} = (a_1^{*}, a_2^{*})$ must be located. S is shown in Figure 1. 


FIG. 1 


3. A geometric least squares solution. The criterion for a “best” fit will be 
that of constrained least squares. Let $u_i = y_i - x_i^{3}$, $n \ge 2$ be the number of data-points 
$(x_i, y_i)$, and let $\alpha' = (a_1, a_2)$. Then, using 

(3.1) $U' = (u_1, u_2, \ldots, u_n)$

and 

(3.2) $Z = \begin{pmatrix} z_{11} & z_{21} \\ z_{12} & z_{22} \\ \vdots & \vdots \\ z_{1n} & z_{2n} \end{pmatrix},$

the problem is to find $\alpha^{*}$ so that 

(3.3) $Q(\alpha) = (U - Z\alpha)'(U - Z\alpha)$

is minimum with $\alpha^{*} \in S$. 
If $\hat\alpha$, the estimate of $\alpha$ obtained by the usual least squares method, is in S, then 
$\hat\alpha = \alpha^{*}$. If not, then using the 2 × 2 matrix A so that 

(3.4) $A'Z'ZA = I_2,$

and letting $\beta = A^{-1}\alpha = (b_1, b_2)'$, it is well known that the point $\beta^{*}$ on the boundary 
of S in $\beta$-space (where Q now has circular contours) nearest $\hat\beta = A^{-1}\hat\alpha$ yields the 
desired $\alpha^{*} = A\beta^{*}$. The argument is found in Graybill [2], in a more general setting 
in Lewish [5] and Cooke [1], and, more elegantly, in Perlman [6]. 

It is evident now that if $\hat\alpha \notin S$, a geometric solution is obtained by mapping the 
boundary of S into $\beta$-space, finding $\hat\beta$, drawing a circle tangent to the boundary of S 
with $\hat\beta$ as center (thus locating $\beta^{*}$), and computing $\alpha^{*} = A\beta^{*}$. The reader who is 
interested in some statistical properties of the estimator $\alpha^{*}$ is referred to [3]. 


4. An example. The data in Table 1 will be used to illustrate the necessary 
computations. This data has been contrived so that $\hat\alpha \ne \alpha^{*}$. 


TABLE 1. Data 
i    x_i      y_i 
0    0        0 
1    4/32     1/8 
2    5/32     5/8 
3    28/32    6/8 
4    1        1 

TABLE 2. Modified Data 
i    u_i       z_1i      z_2i 
1    .12305    .12305    .01367 
2    .62119    .15244    .02060 
3    .08007    .20507    .09570 


The data converted to $(u, z_1, z_2)$-data appears in Table 2. Note that the (x, y)-data 
when i = 0 and i = 4 has already been used to yield $a_0 = 0$ and $a_1 + a_2 + a_3 = 1$. 
Five-decimal accuracy was arbitrarily used. 

Using the data from Table 2 the unconstrained least squares estimate of $\alpha$ is 
$\hat\alpha' = (3.68, -6.94)$, with two-decimal accuracy being used for locating $\hat\alpha$ in Figure 1. 




A quick reference to Figure 1 shows that $\hat\alpha \notin S$, so we proceed to the geometric 
solution for $\alpha^{*}$. 

The only arithmetic problem is the discovery of the upper-triangular matrix A 
so that $A'Z'ZA = I_2$, where of course Z is known from the experimental data. Since 
regardless of the number of data-points Z'Z is a 2 × 2 matrix, it is not at all difficult 
to find A. 

From the z-data in Table 2, 

(4.1) $A = \begin{pmatrix} 3.52609 & -6.28455 \\ 0 & 20.67397 \end{pmatrix}$

and $A^{-1} = A'Z'Z$ is 

(4.2) $A^{-1} = \begin{pmatrix} .28360 & .08621 \\ 0 & .04833 \end{pmatrix}.$


The mapping of the boundary of S to $\beta$-space is accomplished by either substituting 
$A\beta$ for $\alpha$ in (3.3) and sketching the resulting ellipse or mapping point-by-point using 
$\beta = A^{-1}\alpha$. One should note that while the ellipse in Figure 1 is always the same, the 
mapping will depend on the particular set of experimental data observed. Figure 2 
shows the result of the mapping for the data in Table 1 along with the solution for $\beta^{*}$. 


FIG. 2 


As well as can be read from Figure 2, $\beta^{*\prime} = (.45, -.31)$. Then $\alpha^{*} = A\beta^{*}$ gives the 
desired solution 

(4.3) $\alpha^{*\prime} = (3.54, -6.41).$

Finally, from $a_1 + a_2 + a_3 = 1$, the estimate $a_3 = 3.87$ is obtained and the monotonic 
cubic is 

(4.4) $y = 3.54x - 6.41x^{2} + 3.87x^{3}.$

References 


1. W. P. Cooke, Convex Programming Applied to the Estimation of the Parameters of Definite 
Quadratic Forms and to Related Tests of Hypotheses, Ph.D. Thesis, Texas A and M University, 
1968. 

2. F. A. Graybill, An Introduction to Linear Statistical Models, McGraw-Hill, New York, 1961 
(p. 112). 

3. H. O. Hartley, R. R. Hocking, and W. P. Cooke, Least squares fit of definite quadratic forms 
by convex programming, Management Science, 13, No. 11, July (1967) 913-925. 

4. G. L. Haynes, Quantification of expertise, The Determination of Empirical and Analytical 
Spacecraft Parametric Curves, Progress Report II, Part 6, NASA Grant SC-NGR-44-001-027, 
May, 1966. 

5. W. T. Lewish, Linear Estimation in Convex Parameter Spaces, Ph.D. Thesis, Iowa State 
University, 1963. 

6. M. D. Perlman, One-Sided Testing Problems in Multivariate Analysis, The Annals of Mathe- 
matical Statistics, Vol. 40, No. 2, April (1969) 549-567. 


A FAMILIAR COMBINATORIAL IDENTITY PROVED BY COMPLEX ANALYSIS 
STEVEN MINSKER, Massachusetts Institute of Technology 


We shall prove that 


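$\binom{n}{0}^{2} + \binom{n}{1}^{2} + \cdots + \binom{n}{n}^{2} = \binom{2n}{n}$

for all non-negative integers n. Let z be a complex variable, let $\Gamma = \{z : |z| = 1\}$, 
and let l and j be integers. Note that 

$\frac{1}{2\pi i}\int_{\Gamma} z^{l}\,\bar z^{\,j}\,\bar z\,dz = \begin{cases} 1 & \text{if } l = j \\ 0 & \text{if } l \ne j. \end{cases}$

Then 

$\frac{1}{2\pi i}\int_{\Gamma} \bar z\,(z+1)^{n}(\bar z+1)^{n}\,dz = \binom{n}{0}^{2} + \binom{n}{1}^{2} + \cdots + \binom{n}{n}^{2},$

as is seen by replacing $(z+1)^{n}$ and $(\bar z+1)^{n}$ by their binomial expansions and integrating 
term-by-term. But 

$\frac{1}{2\pi i}\int_{\Gamma} \bar z\,(z+1)^{n}(\bar z+1)^{n}\,dz = \frac{1}{2\pi i}\int_{\Gamma} \bar z\,(2 + z + \bar z)^{n}\,dz$

$\qquad = \frac{1}{2\pi i}\int_{\Gamma} \frac{\bar z\,(2z + z^{2} + 1)^{n}}{z^{n}}\,dz = \frac{1}{2\pi i}\int_{\Gamma} \frac{(z+1)^{2n}}{z^{n+1}}\,dz.$

By Cauchy’s integral formula, this last expression is just the nth derivative of 
$(z+1)^{2n}/n!$ evaluated at zero. But this is just $(2n)(2n-1)\cdots(n+1)/n!$, or $\binom{2n}{n}$. 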
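
The identity is easy to confirm for small n by direct computation. The sketch below also mirrors the last step of the argument by reading off the coefficient of $z^{n}$ in $(z+1)^{2n}$, obtained here by multiplying $(z+1)^{n}$ by itself; the function names are illustrative.

```python
from math import comb

def sum_of_squares(n):
    """Left side: the sum of the squares of the binomial coefficients C(n, k)."""
    return sum(comb(n, k) ** 2 for k in range(n + 1))

def middle_coefficient(n):
    """Coefficient of z^n in (z + 1)^n * (z + 1)^n, by direct convolution."""
    row = [comb(n, k) for k in range(n + 1)]
    return sum(row[k] * row[n - k] for k in range(n + 1))
```

Both functions return 6 for n = 2 and 20 for n = 3, matching $\binom{4}{2}$ and $\binom{6}{3}$.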




What role should the computer play in mathematics education? Is it an instructional 
aid? Is programming skill as basic as arithmetic? How much programming and 
numerical analysis should be a part of the required curriculum? 

III. Professional activities. How can two-year college faculties maintain their 
mathematical interests? How can junior members progress toward higher degrees? 
How can tyc administrators come to recognize that time spent in professional mathe- 
matical activity is as important to the life of the department as time spent in the 
classroom? Is the manpower now available in the large number of young Ph.D.s 
a threat or a boon to present tyc departments? How can universities better train 
their graduate students for tyc positions? 

We value your additions, comments, opinions, and recommendations. Please 
send them to the Committee through Mr. Chinn. 


PROBLEMS AND SOLUTIONS 
EDITED BY EMORY P. STARKE 


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD 
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN, 
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, 
CHRISTOPH NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: 
EARL M. L. BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, 
WILLIAM R. GEIGER, RAMESH GUPTA, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. 
MAIRHUBER, CURTIS S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND 
WILLIAM L. SOULE, JR. 

All problems (both elementary and advanced) proposed for inclusion in this Department should 
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems 
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of Elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
February 28, 1974. 


An asterisk (*) means neither the proposer nor the editors supplied a solution. 


E 2438*. Proposed by Bernardo Recaman S., Colegio San Carlos, Bogota, 
Colombia 


For each natural number n, let f(n) denote the least perfect square that begins 
with the digits of n; e.g., f(2) = 25, f(4) = 4, f(5) = 529, f(12) = 121. Define 
g(n) = max(f(1), ..., f(n)). Do there exist arbitrarily long sequences of consecutive 
integers on which g is constant? 
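
For readers who wish to experiment, the two functions of the problem can be computed by brute force. This is only a direct transcription of the definitions for small n, not an approach to the question itself.

```python
from itertools import count

def f(n):
    """Least perfect square whose decimal digits begin with those of n."""
    prefix = str(n)
    for k in count(1):
        if str(k * k).startswith(prefix):
            return k * k

def g(n):
    """Maximum of f(1), ..., f(n)."""
    return max(f(i) for i in range(1, n + 1))
```

The values quoted in the statement are reproduced: f(2) = 25, f(4) = 4, f(5) = 529 and f(12) = 121.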


E 2439. Proposed by Edmund Umberger, Pennsylvania State University 


Let · and × denote the usual dot and cross product of vectors in 3-space, and 
let * denote any of the following operations: the usual scalar multiplication of scalar 
and vector, multiplication of vector and scalar with the obvious interpretation, 
or ordinary multiplication of scalar and scalar. How many of the 3^n ways of inserting 
the symbols ·, ×, and * between consecutive vectors of the string v_1 v_2 ... v_{n+1} 
will result in meaningful expressions by suitable insertion of parentheses? For 
example, v_1 × v_2 * v_3 · v_4 can be made meaningful, whereas v_1 * v_2 · v_3 * v_4 cannot. 
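
For small n the count can be obtained by brute force. The sketch below is an illustration, not a solution: it tracks which result types (scalar "S" or vector "V") each substring can produce under some parenthesization. The operator characters ".", "x", "*" and all function names are conventions chosen here.

```python
from functools import lru_cache
from itertools import product


def combine(left, op, right):
    """Type of `left op right` ('S' scalar, 'V' vector), or None if meaningless."""
    if op == ".":                       # dot product: vector . vector -> scalar
        return "S" if (left, right) == ("V", "V") else None
    if op == "x":                       # cross product: vector x vector -> vector
        return "V" if (left, right) == ("V", "V") else None
    # op == "*": at least one operand must be a scalar
    if left == "S" and right == "S":
        return "S"
    if "S" in (left, right):
        return "V"
    return None


def is_meaningful(ops):
    """True if some parenthesization of v_1 ops[0] v_2 ... ops[-1] v_{n+1} makes sense."""
    ops = tuple(ops)

    @lru_cache(maxsize=None)
    def types(i, j):
        # Result types obtainable from the vectors with indices i..j (0-based, inclusive).
        if i == j:
            return frozenset({"V"})
        found = set()
        for m in range(i, j):           # ops[m] is applied last, between indices m and m + 1
            for a in types(i, m):
                for b in types(m + 1, j):
                    t = combine(a, ops[m], b)
                    if t is not None:
                        found.add(t)
        return frozenset(found)

    return bool(types(0, len(ops)))


def count_meaningful(n):
    """Number of the 3**n operator strings that can be made meaningful."""
    return sum(is_meaningful(ops) for ops in product(".x*", repeat=n))
```

Here `is_meaningful("x*.")` and `is_meaningful("*.*")` correspond to the two examples in the problem statement.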


E 2440. Proposed by R. C. Entringer, University of New Mexico, and D. E. 
Jackson, Los Alamos Scientific Laboratories 


Does every permutation of the integers 0, 1, ..., n contain an arithmetic progression 
of at least three terms? 


E 2441. Proposed by Cornelius Groenewoud, Snyder, N.Y. and F. K. Hwang, 
Bell Telephone Laboratories 


One has n locations and m teams of n players each. Every week, each team is to 
send one player to each location where a resolvable round-robin tournament is 
conducted. Show that it is possible to construct an n-week schedule such that every 
player goes to every location exactly once, and each player plays against every other 
player on every other team exactly once, whenever n is a prime power which exceeds m. 


E 2442. Proposed by J. C. Hemperly, University of Maryland 


Let ω_1, ω_2, ..., ω_n denote the nth roots of unity. Evaluate 

    Σ |ω_i − ω_j|^{−2}, 

the sum being taken over all distinct i, j. 


E 2443. Proposed by T. M. Apostol, California Institute of Technology 


Let f_1 and f_2 be two linearly independent functions which are continuous on the 
bounded interval [a, b]. Show that for every pair of constants c_1 and c_2 there exists 
a continuous function h on [a, b] such that 

    ∫_a^b h(x) f_1(x) dx = c_1   and   ∫_a^b h(x) f_2(x) dx = c_2. 


SOLUTIONS OF ELEMENTARY PROBLEMS 
Divisors of Powers Plus 1 


E 2378 [1972, 906]. Proposed by D. E. Penney, University of Georgia 
Let a, m and n be natural numbers. Evaluate (a^m + 1, a^n + 1). 


Solution by Alan Stein, University of Connecticut at Waterbury. Letting 
g = (a^m + 1, a^n + 1), the solution is 

    g = a^{(m,n)} + 1   if mn/(m,n)^2 is odd, 
    g = 1               if mn/(m,n)^2 is even and a is even, 
    g = 2               if mn/(m,n)^2 is even and a is odd. 

We first establish a lemma: If (m, n) = 1, m > n, and e, f = ±1, then 

    (a^m + e, a^n + f) = (a^m + e − ef(a^n + f), a^n + f) = (a^m − ef a^n, a^n + f) 
                       = (a^{m−n} − ef, a^n + f), since (a^n, a^n ± 1) = 1. 


Note that both — ef and f are — 1 only when both e and f are — 1. 

When (m, n) = 1, repeated application of the lemma produces g equal to either 
(a + 1, a + 1) or (a + 1, a − 1), whence g = a + 1, or g = 1, or g = 2. If k is odd, 
then a + 1 divides a^k + 1; so, if both m and n are odd, then g = (a + 1, a + 1) 
= a + 1. But if k is even, then a + 1 divides a^k − 1, so a + 1 is not a divisor of 
a^k + 1 unless a = 1; hence g = (a + 1, a − 1) if either m or n is even. But (a + 1, 
a − 1) equals 1 if a is even and 2 if a is odd. 

Now let d = (m, n), and note that (a^m + 1, a^n + 1) = ((a^d)^{m/d} + 1, (a^d)^{n/d} + 1) 
with (m/d, n/d) = 1. The solution follows. 
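
The three-case formula lends itself to a direct numerical check. The sketch below (function name chosen here for illustration) compares the stated value of g with a gcd computed by the standard library over a small range of a, m and n.

```python
from math import gcd


def predicted_gcd(a: int, m: int, n: int) -> int:
    """Value of (a^m + 1, a^n + 1) according to the three cases of the solution."""
    d = gcd(m, n)
    if ((m // d) * (n // d)) % 2 == 1:   # mn/(m,n)^2 is odd
        return a ** d + 1
    return 1 if a % 2 == 0 else 2        # mn/(m,n)^2 is even


def check(limit_a: int = 7, limit_exp: int = 8) -> bool:
    """Compare the formula with a direct gcd for all small a, m, n."""
    return all(
        gcd(a ** m + 1, a ** n + 1) == predicted_gcd(a, m, n)
        for a in range(1, limit_a + 1)
        for m in range(1, limit_exp + 1)
        for n in range(1, limit_exp + 1)
    )
```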


Also solved by W. F. Buckler, John Coolidge, R. B. Eggleton, M. G. Greening (Australia), L. 
Kuipers, J. G. Mauldon, Oto Strauch (Czechoslovakia), Guy Torchinelli, and the proposer. 


Editorial Note. O. H. Fraser refers to Exercise 590 in Faddeev & Sominskii, Problems in Higher 
Algebra, which asks for (x^m + a^m, x^n + a^n) and incorrectly omits the answer 2 when x and a are 
both odd and mn/(m, n)^2 is even. Coolidge points out that, when k is odd, a^k + 1 = a^k − (−1)^k, 
so that the first case follows from the solution to E2295 [1972, 398]. Four partial solutions were 
received. 


Matrices with A ≥ 0 and A^{−1} ≥ 0 


E 2379 [1972, 1033]. Proposed by H. Kestelman, University College, London, 
England 


Find all matrices A such that both A and A^{−1} have all elements real and 
nonnegative. 




Solution by Bennett College Team. A and A^{−1} have all elements real and 
nonnegative if and only if each row and each column of A has exactly one positive 
element and the rest of the elements are zeros. 


Proof. Suppose first that the n × n matrix A = (a_{ij}) has exactly one positive 
element in each row and each column and that all other elements are zeros. Define 
B = (b_{ij}) in the following way: b_{ij} = 0 if a_{ji} = 0 and b_{ij} = 1/a_{ji} if a_{ji} ≠ 0. Clearly 
B = A^{−1}, and B has all elements real and nonnegative. 

Conversely, let A = (a_{ij}) and B = (b_{ij}) be n × n matrices with real nonnegative 
elements such that AB = I, and assume that one row of A has two or more positive 
elements, say a_{ir} > 0 and a_{is} > 0, r ≠ s. Then if j ≠ i, 

    0 = Σ_{k=1}^{n} a_{ik} b_{kj} = ... + a_{ir} b_{rj} + ... + a_{is} b_{sj} + ... . 

Therefore b_{rj} = b_{sj} = 0 for j ≠ i, so rows r and s of B both vanish outside column i 
and are linearly dependent, which implies that B is singular, a contradiction. 
Similarly the assumption that a column of A has two or more positive elements 
leads to a contradiction. 
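
The first half of the proof is constructive, and a small exact-arithmetic sketch may make it concrete. The function names and the sample matrix below are illustrative choices, not part of the published solution.

```python
from fractions import Fraction


def monomial_inverse(a):
    """Inverse of a matrix with exactly one nonzero entry in each row and column:
    transpose the pattern and take reciprocals of the nonzero entries."""
    n = len(a)
    b = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if a[i][j] != 0:
                b[j][i] = 1 / Fraction(a[i][j])
    return b


def matmul(x, y):
    """Plain matrix product of two square lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Sample matrix (an arbitrary illustration): one positive entry per row and column.
A = [[0, 2, 0],
     [0, 0, 5],
     [3, 0, 0]]
B = monomial_inverse(A)
```

With these values, `matmul(A, B)` is the identity and every entry of `B` is nonnegative, as the proof asserts.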


Also solved by D. T. Adams, K. F. Andersen, Young Archimedes, C. M. Bang, Frederick Carty, 
John Christopher, R. E. DeMarr, Marjorie Fitting, R. A. Gibbs & L. S. Johnson, J. Z. Hearon, 
Melvin Henriksen, G. A. Heuer, D. M. Jordon, E. S. Lander, Detlef Laugwitz (Germany), Joel Levy, 
Milan Lustig (Czechoslovakia), Carolyn MacDonald, Andrzej Makowski (Poland), Jack Neems, 
William Nusslein, John O’Neili, Robert Patenaude, Kenneth Rosen, D. S. Rubin, W. C. Stone, Phil 
Tracy, M. R. Vitale, R. M. Warten, G. P. Wene, W. W. Williams, and the proposer. 


Editorial Comment. Both Makowski and DeMarr point out that the above result is given in [1] 
and DeMarr also generalizes the result in [2]. Adams demonstrates in [3] the related result that a 
compact topological group of matrices with nonnegative elements consists entirely of permutation 
matrices. He comments that this yields the interesting corollary that such a group must be finite, a 
result which has implications in algebraic topology (see [4]). 


References 


1. T. A. Brown, M. Juncosa and V. Klee, Invertibly positive linear operators on spaces of continuous 
functions, Math. Ann., 183 (1969) 105-114. MR 42 # 8314. 

2. R. E. DeMarr, Nonnegative matrices with nonnegative inverses, Proc. Amer. Math. Soc., 35 
(1972) 307-308. 

3. D. T. Adams, On Banach lattices and groups of positive matrices, to appear. 

4. D.R. Brown, On clans of nonnegative matrices, Proc. Amer. Math. Soc., 15 (1964) 671-674. 


Polynomial Roots Having Rational Imaginary Part 


E2380 [1972, 1033]. Proposed by Erwin Just, Bronx Community College 


Let f(x) be an irreducible polynomial of degree at least three with rational 
coefficients, and suppose that f(x) has precisely two non-real zeros, z_1 = p + qi 
and z_2 = p − qi, where p and q are real. Could q possibly be rational? 




I. Solution by Allen Charnow and Hwa Tang, California State University 
at Hayward. Let K be the splitting field of f(x) over Q, the rationals. Let φ be an 
automorphism in the Galois group G(K, Q) such that φ(z_1) = r, where r is a real 
zero of f. Then 

    φ(z_1) − φ(z_2) = φ(z_1 − z_2) = φ(2qi) = φ(2q) φ(i). 

Assume now q ∈ Q. Then i ∈ K, and φ(i)^2 = φ(i^2) = −1, so φ(i) = ±i. Thus 
φ(z_2) = r ± 2qi. But no zero of f is of the latter form, a contradiction. Hence q 
cannot be rational. 


II. Solution by J. Ernest Wilkins, Jr., Howard University. We shall infer a 
negative answer to the proposed question from the following more general result. 


THEOREM. Let Q[x] be the set of polynomials in x with coefficients in the field 
Q of rational numbers. If f(x) is in Q[x], is irreducible over Q, and has two complex 
zeros of the form p ± iq with real p and q^2 ∈ Q, q ≠ 0, then there exists an irreducible 
d(x) in Q[x] and a constant a_0 ∈ Q such that f(x) = a_0 d(x + iq) d(x − iq), 
d(p) = 0. In particular the degree of f(x) is even. 


Without loss of generality we may assume that f(x) is a monic polynomial and 
denote its degree by n. Since q^2 ∈ Q, it is clear that f(x + iq) = F(x) + iqG(x), in 
which F(x) and G(x) are in Q[x], F(x) is monic, the degree of F(x) is n, and the 
degree of G(x) is at most n − 1. Since f(p + iq) = 0 and p is real, F(p) = G(p) = 0, 
and so the (monic) greatest common divisor d(x) of F(x) and G(x) has degree at 
least 1, and at most n − 1. Therefore f(x + iq) = d(x)[φ(x) + iqψ(x)], in which 
d(x), φ(x) and ψ(x) are all in Q[x], and so f^2(x) = H(x)K(x), in which 
H(x) = d(x + iq) d(x − iq) is in Q[x] and has degree not less than 2 and not more 
than 2(n − 1), and K(x) = {φ(x + iq) − iqψ(x + iq)} {φ(x − iq) + iqψ(x − iq)} is 
also in Q[x]. 

Let L(x) be the (monic) greatest common divisor of f(x) and H(x). Since f(x) 
is irreducible, either L(x) = 1 or L(x) = f(x). In the first case, f(x) and H(x) are 
relatively prime and so there exist polynomials α(x) and β(x) in Q[x] such that 
αf + βH = 1. Squaring, and using f^2 = HK, we see that H(α^2 K + 2αβf + β^2 H) = 1, 
and this is impossible since the degree of H is at least two. Therefore L(x) = f(x), 
H(x) is divisible by f(x), and H(x) = f(x)A(x) for some monic A(x) in Q[x], implying 
f^2(x) = f(x)A(x)K(x), and so f(x) = A(x)K(x). Since f(x) is irreducible, either 
A(x) = f(x), in which case H(x) = f^2(x), an impossibility since the degree of H is at 
most 2(n − 1) and the degree of f^2 is 2n, or A(x) = 1, in which case f = H. This 
is the desired conclusion since the reducibility of d(x) obviously implies that of f. 

Returning now to the original question, let p_k (k = 1, 2, ..., n − 2) be the zeros 
of f(x) in addition to z_1 and z_2. Since f(x) is irreducible these zeros are distinct 
from each other and from z_1 and z_2, and for each k, either p_k − iq or p_k + iq is 
a zero of d(x) since f(x) = a_0 d(x + iq) d(x − iq) and a_0 ≠ 0. By hypothesis p_k 
is real and so both p_k − iq and p_k + iq are zeros of d(x), which therefore has at least 
2(n − 2) + 1 zeros, namely p and p_k ± iq, for k = 1, 2, ..., n − 2. Since the degree of 
d(x) is n/2, it follows that n/2 ≥ 2n − 3, implying n ≤ 2, contrary to the hypothesis 
that n ≥ 3. We infer that it is not possible for q^2 to be rational and, a fortiori, it is 
impossible for q to be rational. 


Also solved by Robert Gilmer, H. K. Schmidt (Germany), J. H. Smith, and the proposer. Partial 
solution by Michael Goldberg. 


(A, i) Fails in C[0, 1] 


E 2381 [1972, 1035]. Proposed by E. S. Langford, University of Maine 


Suppose that {f_n} is a sequence of continuous real-valued functions defined on 
[0,1] such that f_1(x) ≥ f_2(x) ≥ ... ≥ 0 for all x ∈ [0,1]. Suppose further that the only 
continuous function f such that f_n(x) ≥ f(x) ≥ 0 for all x ∈ [0,1] and all n = 1, 2, ... 
is the zero function. Is it necessarily true that 

    ∫_0^1 f_n(x) dx → 0   as n → ∞? 

I. Solution by Norman Wilson (graduate student), University of Pittsburgh. 
In B. R. Gelbaum and J. M. H. Olmsted, Counterexamples in Analysis (Holden-Day, 
San Francisco, 1964, p. 106) there is exhibited a monotonically decreasing sequence 
{f_n} of continuous functions which converges pointwise to the characteristic function 
χ_C of a Cantor set C of positive Lebesgue measure m(C) (op. cit., pp. 88-89). This 
provides a counterexample to the assertion, for by the Bounded Convergence Theorem, 

    ∫_0^1 f_n(x) dx → ∫ χ_C dm = m(C) > 0. 

Yet suppose that f is continuous and that f_n(x) ≥ f(x) ≥ 0 for every x ∈ [0,1] and 
every n = 1, 2, .... If f(x_0) were positive for some x_0 ∈ [0,1], then by continuity, f 
would be strictly positive on some open subinterval I of [0,1]. That is, for every 
x ∈ I, 0 < f(x) ≤ inf_n f_n(x) = lim_n f_n(x) = χ_C(x), implying that the Cantor set C contains 
an interval, a contradiction since C is nowhere dense. 


II. Solution by H. Kestelman, University College, London, England. The 
answer is no. First cover the rational points of [0,1] by a sequence of open intervals 
whose lengths sum to ½. Let K denote the subset of [0,1] not covered by these 
intervals and for x ∈ [0,1] and n = 1, 2, ... set f_n(x) = (1 − d(x, K))^n, where d(x, K) 
denotes the distance of x to the set K. (Note that K is a closed nowhere-dense set. 
Ed.) Then each f_n is continuous, the sequence {f_n} is monotonically decreasing, and 
for every n = 1, 2, ... 

    ∫_0^1 f_n(x) dx ≥ ½ 

since f_n(x) = 1 for every x ∈ K, and K has Lebesgue measure ≥ ½. Suppose that f is 
continuous and that 0 ≤ f(q) ≤ f_n(q) for every n and every rational q ∈ [0,1]. Then 
0 ≤ f(q) ≤ lim_n f_n(q) = 0 since 1 − d(q, K) < 1 for all rational q. It follows from the 
continuity of f that necessarily f(x) = 0 for all x ∈ [0,1]. 


III. Comment by the proposer. The space C[0,1] of continuous real-valued 
functions on [0,1] is a Riesz space under the pointwise operations, and under the 
integral (L^1) norm this space enjoys all of the properties of an abstract L-space except 
for completeness. It is known that in an abstract L-space, whenever {f_n} is a decreasing 
sequence of positive elements, then f_n → 0 in the order sense if and only if f_n → 0 in 
norm. The problem asks if this is still true in C[0,1], and the answer is no as shown 
above. 

Professors W. A. J. Luxemburg and A. C. Zaanen have kindly pointed out that 
this problem appears as Exercise 18.14 (p. 104) of their book, Riesz Spaces (Vol. I), 
North Holland, Amsterdam, 1971. The result was also noted in the same authors' 
earlier paper, Notes on Banach function spaces (X), Nederl. Akad. Wetensch. Proc. 
Ser. A 67 = Indag. Math. 26 (1964), 493-506. In general, a normed Riesz space is 
said to have Property (A, i) by these authors if ‖f_n‖ ↓ 0 whenever f_n ↓ 0. 


Also solved by John Annulis, Sheldon Axler, Anders Bager (Denmark), Fred Barber, Gerald 
Beer, T. A. Bick, Robert Breusch, Benjamin Burrell, Frederick Carty, R.C. Detmer, Harley Flanders, 
Leon Gerber, Gary Gundersen, M. L. J. Hautus (Netherlands), Ellen Hertz, Lee Hill, Lee Keener, 
J. A. Kelingos, P. G. Kirmser, E. M. Klein, Detlef Laugwitz (Germany), Joel Levy, John MacBain, 
Glenn Meyers, L. F. Meyers, Paul Milnes, S. E. Minear, William Nuesslein, John Oman & A. L. 
Perrie, M. Bhaskara Rao (England), Jürg Rätz (Switzerland), The Ronald Reagan Memorial Problems 
Group of CSU (Hayward), L. A. Ringenberg, Kenneth Rosen, Sullivan Ross, T. Salat (Czecho- 
slovakia), Nan-Shan Shou, Michael Steiner (Sweden), Manfred Stoll, John Swetits, U. R. U., R. M. 
Warten, M. Weiss & C. Wexler & M. Goldstein, J. K. Yates, and the proposer. Eight (incorrect) 
affirmative answers to the question were received. Most attempted to use Lebesgue’s Dominated 
Convergence Theorem, sometimes in conjunction with Egoroff’s Theorem. Two incorrect counter- 
examples were received. 

Editorial Comment. All of the solutions submitted depended on the existence of a closed 
nowhere-dense subset C of [0,1] of positive Lebesgue measure. In all circumstances, the subset presented was 
either the classical Cantor set as used by Wilson or the subset presented by Kestelman constructed 
by deleting open intervals about the rationals (or any countable dense subset). Once the set C was 
found, the construction of the monotonically decreasing sequence {f_n} of continuous functions 
converging pointwise to χ_C proceeded in basically one of four directions: (1) Citing the result from 
Gelbaum and Olmsted; (2) Constructing f_n in a piecewise linear fashion; (3) Using a d(x, C) 
construction as in Kestelman's solution; (4) Noting that C is closed so that it is a G_δ set and hence there 
exists a sequence G_1 ⊃ G_2 ⊃ ... of open sets such that ∩ G_n = C. Urysohn's Lemma was then 
used to show the existence of continuous functions g_n such that g_n(x) = 1 if x ∈ C and g_n(x) = 0 
if x ∈ [0,1]\G_n, and f_n was then taken to be either g_1 g_2 ... g_n or g_1 ∧ g_2 ∧ ... ∧ g_n. 

Swetits comments that if {f_n} is a monotonically decreasing sequence of continuous functions 
satisfying the conditions of the problem and if f is the pointwise limit of the f_n, then necessarily f(x_0) 
= 0 at every point of continuity x_0 of f. It follows that if f is Riemann-integrable, then necessarily 
f = 0 almost everywhere so that ∫ f(x) dx = 0. This means that there can be no counterexample 
with a Riemann-integrable limit function. (Note that if C is a closed nowhere-dense set of positive 
measure, then the set of discontinuities of the characteristic function χ_C is precisely C, a set of 
positive measure.) 
On a lighter note, Kelingos remarks that “Operating on the premise that any pathological 
situation in real variables that mortal man can conjecture is possible, the answer is no.” 


Shades of E 712 
E 2382 [1972, 1034]. Proposed by Thomas Hughes, Fort Worth, Texas 


One has a number of balls, identical in appearance; one of the balls is known to 
be slightly heavy, another slightly light by the same amount, and the rest have a 
standard weight. It is desired to isolate both the light and heavy balls, using only 
three weighings on a “‘triple platform balance.’’ (A triple platform balance consists 
of three arms forming a Y, equally spaced at intervals of 120°; these are supported 
at the center, and at the end of each arm is a pan. If n balls are placed in each of 
the three pans, then one can tell whether each of the three sets of balls is heavier, 
lighter, or the same weight as n standard balls; note however that the heavy ball 
and the light ball weigh as much as two standard balls.) 

What is the largest number of balls from which one can identify both the heavy 
ball and the light ball in only three weighings? 


Composite solution edited from those given by Guy Torchinelli, SUNY at 
Buffalo, and O. P. Lossers, Technological University, Eindhoven, The Netherlands. 
As a result of the first weighing, the set of balls is divided into four subsets, namely 
the three subsets formed by balls weighed in the three pans and the subset of un- 
weighed balls. As a result of the second weighing, each of these four sets is divided 
into four subsets. If the first two weighings balance, then each of these sixteen sets 
has standard weight. There are 13 possible outcomes for the last weighing: equilibrium 
(0, 0, 0,0), and 12 permutations of one pan heavy and one pan light (+, —,0,0). The 
third weighing, then, cannot locate the bad balls among the 16 subsets, when the 
first two weighings both yield equilibrium, if: 

(1) One of the sets contains at least 5 balls (since there are 5·4 = 20 possible 
choices for the bad balls). 

(2) One set contains 4 balls and another set at least 2 balls (4·3 + 2 = 14 
choices). 

(3) Each of two sets contains 3 balls (even though there are only 3·2 + 3·2 = 12 
choices). 

(4) One set contains 3 balls and each of four sets contains 2 balls (3·2 + 4·2 = 14 
choices). 

(5) Each of seven sets contains 2 balls (14 choices). 

When 23 or more balls are separated into sixteen subsets, there are 7 extra balls 
over and above one ball per subset. Thus one of the five indeterminate situations 
occurs, so three weighings will not suffice for 23 balls. 
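
The counts quoted in cases (1) to (5) are the numbers of ordered (heavy, light) pairs that lie in a common subset. The sketch below recomputes them; the function name is chosen here for illustration, and the subset sizes in `ARRAY_SIZES` are read off the 22-ball array given in the solution.

```python
def bad_pair_choices(sizes):
    """Number of ordered (heavy, light) pairs lying together in one subset."""
    return sum(s * (s - 1) for s in sizes)


# Outcomes of one weighing: equilibrium, or a heavy and a light position
# chosen among the three pans and the set-aside group (4 * 3 = 12).
OUTCOMES = 1 + 4 * 3

# Sizes of the sixteen subsets in the 22-ball array of the solution.
ARRAY_SIZES = [2, 2, 1, 1,
               1, 2, 2, 1,
               2, 1, 2, 1,
               1, 1, 1, 1]
```

Cases (1), (2), (4) and (5) fail simply because the count exceeds the 13 available outcomes; case (3) yields only 12 and is excluded for a different reason, as the parenthetical remark in the solution indicates.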




We now show it possible to make the decision in 3 weighings if there are 22 balls. 
Consider the array below where the set in row i and column j is weighed in pan i in 
the first weighing and in pan j in the second weighing (“‘pan 4’’ contains those balls 
set aside). 


                          Weighing 2 
Weighing 1     Pan 1     Pan 2     Pan 3     Pan 4 
Pan 1          1, 2      3, 4      5         6 
Pan 2          7         8, 9      10, 11    12 
Pan 3          13, 14    15        16, 17    18 
Pan 4          19        20        21        22 


If each of the first two weighings results in a permutation of (+, —,0,0), then we 
are left with at most 4 balls from which we are to find the lighter and the heavier in 
the final weighing, and this is trivial. If the first weighing results in (0,0,0,0) and the 
second in (+, —,0,0), for example, then in the same row of the above array the 
heavy ball is in the first column and the light ball is in the second column. It is a 
simple matter to check that a final weighing of {1,7, 8}, {2,9, 15}, {3, 13, 19}, {4, 14, 20} 
in the four pans locates the bad balls. Note that the last column, though not of the 
same type as the others, may be changed, when necessary, by adding to it balls 
known to be of the correct weight. Finally, if the first two weighings both balance, 
then the bad balls are one of the six pairs found in an intersection of a row and a 
column. They are easily located in the third weighing of {1,10,16}, {3,11, 13}, 
{8, 14,17}, {2,4, 9}. It follows that the required decision can be made for any number 
of balls not exceeding 22. 


Also solved by Jordi Dou (Spain), H. S. Hahn, Peter Klein (Sweden), H. L. Nelson, Kenneth 
Rosen, and the proposer. 

Editorial Note. Ten balls was the most common of the 7 incorrect answers received, and 64 the 
largest. Two solvers report that two weighings suffice for 9 balls, but, curiously, not for 8. For 
9 balls, weigh {1, 2}, {3, 4}, {5, 6}, {7, 8, 9} in the first weighing. If (0, 0, 0, 0), then weigh 
{1, 7}, {3, 8}, {5, 9}, {2, 4, 6} the second time; if (+, 0, 0, −) the first weighing, then weigh {1, G}, 
{2, 7}, {8, G}, {9} the second time, where G denotes any known good ball. The other cases are 
easy. For 8 balls, which can be done in two weighings, eliminate ball 9 on the first weighing above. 
If (0, 0, 0, 0), then use {1, 5}, {2, 3}, {4, 7}, {6, 8} in the second weighing. Other cases are trivial. 

With one weighing, four balls is the maximum. 

Walter Bluger notes the similarity of this problem to problem E 712 [1947, 46] for a two-pan 
balance. 

A general solution for four or more weighings is solicited (see E 712, for example). By the subset 
argument above, three weighings of equilibrium produce 64 subsets, so 64 + 6 = 70 balls is an 
upper bound for four weighings. Other considerations seem to indicate a smaller least upper bound. 
The editors have a solution for 60 balls in four weighings, copies of which can be obtained by writing 
to the Problems Group. 


Catalan Numbers 


E2383 [1972, 1034]. Proposed by E. T. Ordman, University of Kentucky 
Let n be a nonnegative integer. For p = 1, 2, ... define 

    S_p(n) = Σ_{k=0}^{[n/2]} [\binom{n}{k} − \binom{n}{k−1}]^p, 

where we make the usual conventions regarding binomial coefficients. It is easy 
to evaluate S_1(n). Evaluate S_2(n). 


Solution by Richard Gibbs and Harold Stocker, Fort Lewis College, Durango, 
Colorado. Let 

    T(n) = Σ_{k=0}^{n+1} [\binom{n}{k} − \binom{n}{k−1}]^2. 

By the symmetry of the binomial coefficients, T(n) = 2S_2(n). Now 

    T(n) = Σ_{k=0}^{n+1} [\binom{n}{k}^2 − 2\binom{n}{k}\binom{n}{k−1} + \binom{n}{k−1}^2]. 

From the identity \binom{n}{k} + \binom{n}{k−1} = \binom{n+1}{k} we obtain 

    2\binom{n}{k}\binom{n}{k−1} = \binom{n+1}{k}^2 − \binom{n}{k}^2 − \binom{n}{k−1}^2, 

    T(n) = 2 Σ_{k=0}^{n+1} \binom{n}{k}^2 + 2 Σ_{k=0}^{n+1} \binom{n}{k−1}^2 − Σ_{k=0}^{n+1} \binom{n+1}{k}^2. 

Using the identity Σ_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n} and the fact that \binom{n}{−1} = \binom{n}{n+1} = 0 we obtain 

    T(n) = 2\binom{2n}{n} + 2\binom{2n}{n} − \binom{2n+2}{n+1}. 

Therefore 

    S_2(n) = ½ T(n) = \binom{2n}{n} − \binom{2n}{n−1} = \frac{1}{n+1} \binom{2n}{n}. 

Thus S_2(n) is just the nth Catalan number. 
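
The conclusion can be checked numerically. The sketch below assumes, since the printed definition is damaged in this copy, that the sum defining S_p(n) runs over 0 ≤ k ≤ [n/2]; this range is consistent with the relation T(n) = 2S_2(n) used in the solution. Function names are illustrative.

```python
from math import comb


def binom(n: int, k: int) -> int:
    """Binomial coefficient with the usual convention that it vanishes outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def S(p: int, n: int) -> int:
    """Sum over 0 <= k <= n // 2 of (C(n, k) - C(n, k - 1)) ** p (assumed range)."""
    return sum((binom(n, k) - binom(n, k - 1)) ** p for k in range(n // 2 + 1))


def catalan(n: int) -> int:
    """The nth Catalan number C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

Under the same assumption, S(1, n) telescopes to the central binomial coefficient C(n, [n/2]), which is presumably the easy evaluation the proposer had in mind.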


Also solved by Günter Bach (Germany), Anders Bager (Denmark), M. T. Bird, D. M. Bloom, 
Dieter Bode (Germany), Frederick Carty, H. W. Gould, M. G. Greening (Australia), Robert Heller, 
J. D. Hiscocks, O. P. Lossers (Netherlands), Alexandru Lupas (Romania), Milan Lustig 
(Czechoslovakia), Kumer Murty & Ram Murty, M. R. Railkar (India), Jürg Rätz (Switzerland), Kenneth 
Rosen, F. C. Smith, Phil Tracy, E. Trost (Switzerland), David Zeitlin, and the proposer. 


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers — The State University, 
New Brunswick, N. J. 08903. Solutions of Advanced Problems in this issue should be typed (with 
double spacing) on separate, signed sheets and should be mailed before February 28, 1974. 


An asterisk (*) means neither the proposer nor the editors supplied a solution. 
5934. Proposed by R. C. Buck, University of Wisconsin 


What entire functions f obey the inequality 

    |f(z + a)| < |f(z)| |f(a)| 

for all z and for two non-zero choices of a whose ratio β = a_1/a_2 is irrational? 


5935*. Proposed by K. Selucky, Brno, Czechoslovakia 
Suppose 


1 1 1 1 1 1 1 1 
— +——— + +— = +—+—+ ; 
X, S—X, S—Xz3 Xe S—Xy X% Xp SH 


where $s > X, 2X, 2%32%42345. Prove x, = xX, and x3 = X,. 
5936. Proposed by David Styer, University of Cincinnati 


Is there a function f, bounded and holomorphic in |z| < 1, and a polynomial p 
such that, in |z| < 1, f/p assumes every complex value infinitely often with at most 
one exception? 


5937. Proposed by Albert Wilansky, University of Reading, England 


Give an example of a vector space with two comparable (unequal) norms such 
that it is barreled with the larger norm and a Banach space with the smaller. [A 
barreled (locally convex topological vector) space is one such that every closed 
absolutely convex absorbing set is a neighborhood of 0.] 


5938. Proposed by Detlef Laugwitz, Nieder-Ramstadt, Germany 


Let G be an archimedean subfield of the ordered field F, and let A be an order 
preserving automorphism of F. Is it true that AG = G? 


5939. Proposed by J. W. Andrushkiw, Seton Hall University 


Let f(z) = a_0 + a_1 z + ... + a_n z^n be a polynomial with real positive coefficients. 
Show that if for some k, 0 ≤ k < n − 3, the inequality a_k a_{k+3} ≥ 3a_{k+1} a_{k+2} holds, 
then f(z) cannot have all of its zeros located in the complex left halfplane. 




1068 ADVANCED PROBLEMS AND SOLUTIONS [November 
SOLUTIONS OF ADVANCED PROBLEMS 
Metrizability from a Neighborhood Base 


5859 [1972, 524]. Proposed by L. A. Feldman, Stanislaus State College, 
California 


Prove that a T_0 topological space (X, T) is a metric space if and only if each 
x ∈ X has a neighborhood base of open sets 

    {B_r(x) | r ∈ (0, 1]} 

such that (1) if r, s ∈ [0,1] with r < s, and B_0(x) = {x}, then B_r(x) ⊂ B_s(x); (2) if 
B_r(x) ∩ B_s(y) ≠ ∅ for r, s ∈ [0,1], where 0 < r + s < 1, then for some t where 
0 < t < r + s, we have x ∈ B_t(y). 


Solution by D. G. Belanger, University of South Alabama. We first show that 
X is T_2. Let {x_i} be a net approaching distinct points x and y. From property (2), 
x ∈ B_t(y) and y ∈ B_t(x) for every t ∈ (0,1], hence, because X is T_0, x = y. Let 

    d(x, y) = min(1, inf {r | y ∈ B_r(x)}). 

Since X is T_2, d(x, y) = 0 if and only if x = y. To prove symmetry assume that 
d(x, y) = r, d(y, x) = s and r + ε < s for some x, y ∈ X, ε > 0. Since B_{r+ε}(x) ∩ B_0(y) 
⊃ {y}, x ∈ B_{r+ε}(y), contradicting the assumption. Thus d(x, y) = d(y, x). If y ∈ B_r(x) 
and y ∈ B_s(z) then z ∈ B_t(x) where 0 < t < r + s by property (2). This proves the 
triangle inequality d(x, z) ≤ d(x, y) + d(z, y). Since B_r(x) = {y | d(x, y) < r}, the 
metric d induces the topology on X. 


Also solved by R. A. Christiansen, David Singmaster (England), P. van der Steen (Nether- 
lands), R. K. Tomaki, and the proposer. 

Note. Tomaki and Christiansen point out that this result is a consequence of the metrization 
theorem of A. H. Stone to be found on p. 196 of Dugundji, Topology. 


Submodules as Direct Summands 


5862 [1972, 667]. Proposed by R. C. Wagner, Fairleigh Dickinson University 


A submodule N of the R-module M is said to be pure if for every r ∈ R, rN 
= N ∩ rM. Prove that if R is a commutative Noetherian ring with unit and M is a 
finitely generated R-module for which every submodule is pure, then every submodule 
is a direct summand of M. 


Solution by James R. Smith, Appalachian State University. Suppose some 
submodule of M is not a direct summand. M is Noetherian since it is a finitely 
generated module over a Noetherian ring. So let N be a maximal element in the set 
of all submodules of M which are not direct summands. Then, if x ∈ M − N, we 
must have Rx ∩ N ≠ 0, for, otherwise, N would be a summand since N + Rx is a 
summand. Let x ∈ M − N. Then N + Rx = M, for N + Rx is a summand of M, so if 
P is a complementary summand for N + Rx and y ∈ P, by the above if y ≠ 0 we have 
Ry ∩ N ≠ 0. But P ∩ N = 0, so y = 0. Now let z ∈ M − N be such that N : z 
(= {a ∈ R | az ∈ N}) is maximal in the set of all N : y for y ∈ M − N. Let N : z = A and 
suppose that A is generated by a_1, a_2, ..., a_m. We prove by induction that there is a 
z' ∈ M − N such that a_i z' = 0 for i = 1, 2, ..., n and N : z' = A. For n = 1 we have 
a_1 z ∈ N, so a_1 z = a_1 w for some w ∈ N since N is pure. Thus a_1(z − w) = 0 and 
z − w ∈ M − N, and N : (z − w) ⊃ N : z, so N : (z − w) = N : z = A. 
Suppose true for n < m. Let w be such that N : w = A and a_i w = 0 for 1 ≤ i ≤ n. 
Then Ra_{n+1}w is a pure submodule of M and a_{n+1}w ∈ Ra_{n+1}w, so a_{n+1}w = a_{n+1}(ra_{n+1}w) 
for some r ∈ R. Let z' = w − ra_{n+1}w. Then a_i z' = 0 for i = 1, 2, ..., n + 1, 
z' ∈ M − N and N : z' = A. So we have that this is true for m and there is some 
w ∈ M − N such that N : w = A and Aw = 0. But then Rw ∩ N = 0, a contradiction 
to an earlier observation. 


Also solved by Jim Brewer & Phil Montgomery & Ed Rutter, S. H. Cox, D. Z. Djokovi¢é, Bruce 
Ferrero, E. R. Gentile (Argentina), Robert Gilmer, A. A. Jagers (Netherlands), Daniel Opitz, 
T. G. Parker, and the proposer. 


Irreducibles in Integral Domains 


5863 [1972, 667]. Proposed by P. R. Chernoff, University of California, 
Berkeley 


Let D be an integral domain with infinitely many elements. Assume that every 
non-unit in D has an irreducible factor. 
Prove that D has infinitely many irreducibles or infinitely many units. 


Solution by Bruce Ferrero, Princeton University. If D has no irreducibles, then 
every non-zero element must be a unit, and D is an infinite field. If D has an irreducible, 
but only finitely many, let P be their product. For each n ≥ 1, let A_n = P^n + 1. Then 
A_n must be a unit, since any irreducible divides P. If D also has only finitely many 
units, then A_n = A_m for some n > m, and hence P^{n−m} = 1, which is absurd. 


Also solved by James Alonso, D. D. Anderson, J. T. Arnold, Jim Brewer & Phil Montgomery & 
Ed Rutter, P. K. Garlick, S. N. Gersten, Robert Gilmer, G. S. Glazer, A. A. Jagers (Netherlands), 
Wells Johnson, L. E. Mattics, Barbara Osofsky, J. R. Smith, Hwa Tang, W. C. Waterhouse, and the 
proposer. 


On the Parts in a Partition 


5864 [1972, 668]. Proposed by G. E. Andrews, Pennsylvania State University 


Let P_n denote the set of partitions of n into positive integers. For each π ∈ P_n, let 
d(π) denote the number of different parts of π, and let #(π) denote the total number 
of parts of π. Prove that for n ≥ 1, 

    Σ_{π ∈ P_n} (−1)^{#(π)} 2^{d(π)} = 2(−1)^n if n is a square, and 0 otherwise. 


Solution by Harry Lass, Jet Propulsion Laboratory, California Institute of 
Technology. Let n = n_π = Σ_{k=1}^{n} kα_k be a representation of n into positive integers, 
and define U(0) = 0, U(α) = 1 for α a positive integer. The number of distinct parts 
of n_π is d(π) = Σ_{k=1}^{n} U(α_k), and the total number of parts of n_π is #(π) = Σ_{k=1}^{n} α_k. 

We note that 

    Σ_{α=0}^{∞} (−1)^α 2^{U(α)} s^{kα} = \frac{1 − s^k}{1 + s^k},   |s| < 1. 

If we let 

    f(n) = Σ_π (−1)^{#(π)} 2^{d(π)},   f(0) = 1, 

with π ranging over all partitions of n into positive integers, then 

    F(s) = Σ_{n=0}^{∞} f(n) s^n = Π_{r=1}^{∞} \frac{1 − s^r}{1 + s^r} = Π_{r=1}^{∞} (1 − s^{2r})(1 − s^{2r−1})^2. 

A theorem of Jacobi states that 

    Π_{r=1}^{∞} (1 − s^{2r})(1 + s^{2r−1} z)(1 + s^{2r−1} z^{−1}) = Σ_{n=−∞}^{∞} s^{n^2} z^n 

for |s| < 1, z ≠ 0. Setting z = −1 yields 

    F(s) = Σ_{n=−∞}^{∞} (−1)^n s^{n^2} = 1 + 2 Σ_{n=1}^{∞} (−1)^n s^{n^2}, 

so that f(n) = 2(−1)^n if n is a square, f(n) = 0 otherwise. 
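
The identity is easy to test by enumerating partitions directly. The following is a small sketch with illustrative function names; it weights each partition by (−1) to the number of parts times 2 to the number of distinct parts, as in the definition of f(n).

```python
from math import isqrt


def partitions(n: int, max_part=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def signed_sum(n: int) -> int:
    """Sum over partitions of n of (-1)^(number of parts) * 2^(number of distinct parts)."""
    return sum((-1) ** len(p) * 2 ** len(set(p)) for p in partitions(n))


def expected(n: int) -> int:
    """Right-hand side of the identity: 2(-1)^n for squares, 0 otherwise."""
    return 2 * (-1) ** n if isqrt(n) ** 2 == n else 0
```

For n = 4 the five partitions contribute −2, +4, +2, −4 and +2, giving 2 as predicted.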


Also solved by G. E. Andrews, L. Carlitz, N. J. Fine, M. G. Greening (Australia), L. E. Mattics, 
M. R. Modak (India), David Newman, Allen Stenger, Phil Tracy, and by the proposer. 


Editorial Note. G. E. Andrews points out that the formula may be found in a paper by J.W.L. 
Glaisher, Proc. London Royal Soc., 24 (1875-6), Formula III, p. 252. 


A Non-Archimedean Vector Lattice 
5868 [1972, 780]. Proposed by B. C. Anderson, Henry Ford Community College 


Show that the following theorem becomes false if "Archimedean" is omitted. 
R^n is an Archimedean vector lattice with respect to the order generated by a cone K 
if and only if there are n linearly independent vectors v^{(k)} such that K = {x = (x_i) ∈ R^n: 
Σ_i x_i v_i^{(k)} ≥ 0; k = 1, 2, ..., n}. (Note: the word "Archimedean" is inadvertently 
omitted on p. 9 of A. L. Peressini, Ordered Topological Vector Spaces.) 




Solution by J. T. Annulis, University of Arkansas, Monticello. Consider $R^2$ with
the lexicographic order, i.e., with positive cone K = {(x, y): x > 0, or x = 0 and
y ≥ 0}. It is easily verified that $(R^2, K)$ is a vector lattice which is not Archimedean.
Suppose there exist $v^{(1)} = (v_1^{(1)}, v_2^{(1)})$ and $v^{(2)} = (v_1^{(2)}, v_2^{(2)})$ such that
$$K = \{x = (x_i) \in R^2 : x_1 v_1^{(i)} + x_2 v_2^{(i)} \ge 0 \text{ for } i = 1, 2\}.$$
Then, since (1,0) and (0,1) are elements of K, we have $v_j^{(i)} \ge 0$ for j = 1, 2 and
i = 1, 2. By linear independence, $v_2^{(1)}$ and $v_2^{(2)}$ cannot both be zero. We may then
assume that $v_2^{(1)} > 0$. Let $x_2 = -2/v_2^{(1)}$. Then if $x = (x_1, x_2)$, where $x_1 = 1$ if $v_1^{(1)} = 0$
and $x_1 = 1/v_1^{(1)}$ if $v_1^{(1)} > 0$, we have that $(x_1, x_2)$ is an element of K. But $x_1 v_1^{(1)} + x_2 v_2^{(1)} < 0$,
a contradiction to
$$K = \{(x_1, x_2) : x_1 v_1^{(i)} + x_2 v_2^{(i)} \ge 0 \text{ for } i = 1, 2\}.$$
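The two facts used in this solution can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes "Archimedean" in the usual sense that nu ≤ w for every positive integer n forces u ≤ 0, and the helper names `in_cone`, `leq` and `counterexample` are hypothetical.

```python
from fractions import Fraction

def in_cone(v):
    """Membership in the lexicographic cone K = {(x, y): x > 0, or x = 0 and y >= 0}."""
    x, y = v
    return x > 0 or (x == 0 and y >= 0)

def leq(u, w):
    """u <= w in the order generated by K, i.e. w - u lies in K."""
    return in_cone((w[0] - u[0], w[1] - u[1]))

# With u = (0, 1) and w = (1, 0): n*u <= w for every n tried, yet u <= 0 fails.
dominated = all(leq((0, n), (1, 0)) for n in range(1, 1001))
u_nonpositive = leq((0, 1), (0, 0))

def counterexample(a, b):
    """For a vector v = (a, b) with a >= 0 and b > 0, return the point x of K
    chosen in the solution, for which x_1 * a + x_2 * b < 0."""
    x1 = Fraction(1) if a == 0 else Fraction(1) / a
    x2 = Fraction(-2) / b
    return (x1, x2)
```

For example, `counterexample(3, 5)` gives (1/3, -2/5), which lies in K because its first coordinate is positive, although 3(1/3) + 5(-2/5) = -1.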


Also solved by A. L. Peressini, G. C. Schmidt, and by the proposer. 


A Random Inscribed n-gon 


5869 [1972, 780]. Proposed by Anatol Rapoport, University of Toronto 


Let n points be chosen at random on the circumference of a unit circle. Show 
that the expected area of the inscribed n-gon is given by 


$$A(n) = \pi \left[ 1 - n! \sum_{j=1}^{\infty} \frac{(-1)^{j-1} (2\pi)^{2j}}{(2j+n)!} \right].$$


Solution by J. G. Little and O. G. Ruehr, Michigan Technological University.
After choosing the n points, we break the circle at one of them, calling it $\phi_0 = 0$. The
remaining n - 1 points, in order, are distributed as the order statistics $\phi_1, \ldots, \phi_{n-1}$
of the uniform distribution on the interval $(0, 2\pi)$. Letting $\phi_n = 2\pi$, we find that the
differences $\theta_i = \phi_i - \phi_{i-1}$, i = 1, ..., n, have the common probability density function
$$f(\theta) = \frac{n-1}{2\pi} \left(1 - \frac{\theta}{2\pi}\right)^{n-2}$$
[Feller, William, An Introduction to Probability Theory and its Applications,
New York (1966), pp. 21-22.]
We triangulate the n-gon by drawing radii through the n points and chords
through adjacent points. Since the area of the ith triangle is $\tfrac12 \sin\theta_i$, we have
$$A(n) = E\Big(\tfrac12 \sum_{i=1}^{n} \sin\theta_i\Big) = \tfrac12 \sum_{i=1}^{n} E(\sin\theta_i) = \frac{n(n-1)}{4\pi} \int_0^{2\pi} \sin\theta \left(1 - \frac{\theta}{2\pi}\right)^{n-2} d\theta.$$


Two successive partial integrations yield
$$A(n) = \pi \left[ 1 - \int_0^{2\pi} \sin\theta \left(1 - \frac{\theta}{2\pi}\right)^{n} d\theta \right].$$
Replace $\sin\theta$ by its Maclaurin series and integrate term by term, employing the
familiar beta function integral, to obtain
$$A(n) = \pi \left[ 1 - n! \sum_{j=1}^{\infty} \frac{(-1)^{j-1} (2\pi)^{2j}}{(2j+n)!} \right].$$


Note that the last expression is a (convergent) asymptotic expansion for n large. 
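The series is easy to evaluate numerically. The following sketch sums it term by term; the function name and the number of terms are assumptions made for illustration. For n = 3 and n = 4 the integral for A(n) given earlier in the solution can be evaluated by hand, giving 3/(2π) and 3/π, and these values serve as checks.

```python
from math import pi

def expected_area(n, terms=60):
    """Evaluate pi * (1 - n! * sum_{j>=1} (-1)^(j-1) (2 pi)^(2j) / (2j + n)!)."""
    total = 0.0
    term = 1.0  # holds (2 pi)^(2j) * n! / (n + 2j)!, starting from j = 0
    for j in range(1, terms + 1):
        # Passing from j - 1 to j multiplies by (2 pi)^2 / ((n + 2j - 1)(n + 2j)).
        term *= (2 * pi) ** 2 / ((n + 2 * j - 1) * (n + 2 * j))
        total += (-1) ** (j - 1) * term
    return pi * (1 - total)

triangle = expected_area(3)       # compare with 3 / (2 pi)
quadrilateral = expected_area(4)  # compare with 3 / pi
```

Updating each term from the previous one avoids forming large factorials directly, and for large n the leading term alone is already a good approximation.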


Also solved by Günter Bach (Germany), A. A. Jagers (Netherlands), A. F. W. Jaégost (Netherlands),
Harry Lass, J. G. Mauldon, Thomas Spencer, Luis Verde-Star, and the proposer.


Notes. Lass points out that $\lim_{n \to \infty} A(n) = \pi$.


The proposer compares A(n) with A*(n), the designated area of the regular inscribed n-gon, and
observes the following:


$A(n) \sim \pi\big(1 - (2\pi)^2 / [(n+1)(n+2)]\big)$, $A^*(n) \sim \pi\big(1 - (2\pi)^2 / 6n^2\big)$,
$A^*(n) - A(n) \sim 10\pi^3 / 3n^2$,
$A(n+1) - A(n) \sim 8\pi^3 / n^3 \sim 6\big(A^*(n+1) - A^*(n)\big)$.


REVIEWS 


EDITED BY J. ARTHUR SEEBACH, JR. AND LYNN A. STEEN 
with the assistance of the mathematics departments of St. Olaf and Carleton Colleges 
COLLABORATING EDITOR FOR FILMS: SEYMOUR SCHUSTER, CARLETON COLLEGE 


We invite readers to submit reviews of significant recent college-level mathematics books. 
We especially encourage reviews based on classroom use, or comparative reviews of several 
related books. Reviews should ordinarily not exceed two pages (per book) typed double spaced. 
Manuscripts of reviews as well as books submitted for review should be sent to: Book Review 
Editor, American Mathematical Monthly, St. Olaf College, Northfield, MN 55057.


Introduction to Projective Geometry. By C. R. Wylie, Jr. McGraw-Hill, New York, 
1970. vii + 556 pp. $12.40. 

Projective Geometry and Algebraic Structures. By R. J. Mihalek. Academic Press, 
New York, 1972. xi + 220 pp. $9.75. (Telegraphic Review, August-September, 
1972.) 


The appearance of two well-written texts is welcome in any subject, but is doubly
so in projective geometry, a field in which there are relatively few books in print.
Different parts of the subject are emphasized in these two books. The emphasis in 
Mihalek is on the axiomatic foundations of the subject and on the relation between 
the geometrical properties of the plane and the algebraic structure of the system which 
coordinatizes the plane. 


THE AMERICAN 


MATHEMATICAL MONTHLY 


(FOUNDED IN 1894 By BENJAMIN F. FINKEL) 
THE OFFICIAL JOURNAL OF 


THE MATHEMATICAL ASSOCIATION OF AMERICA 


VOLUME 80 NUMBER 10 
CODEN: AMMYAE 
CONTENTS 
Number Fifty-two . . . HARLEY FLANDERS 1099
Prime Numbers and Brownian Motion . . . PATRICK BILLINGSLEY 1099
Correction to "Unique Factorization Domains" . . . P. M. COHN 1115
Correction to "A History of the Prime Number Theorem" . . . L. J. GOLDSTEIN 1115
MATHEMATICAL NOTES
Complements and Comments . . . ROBERT GILMER AND DAVID ROSELLE 1116
A Problem on Series . . . G. J. O. JAMESON 1119
RESEARCH PROBLEMS
Monthly Research Problems, 1969-73 . . . R. K. GUY 1120
CLASSROOM NOTES
The Minimal Polynomial of a Linear Transformation . . . M. D. BURROW 1129
Another Proof of the Rational Decomposition Theorem . . . H. G. JACOB 1131
A Proof of the Chain Rule for Derivatives in n-space . . . A. G. FADELL 1134
MATHEMATICAL EDUCATION
Survival of the Two-year College Mathematics Teacher . . . P. A. LINDSTROM 1135
ELEMENTARY PROBLEMS AND SOLUTIONS 1138
ADVANCED PROBLEMS AND SOLUTIONS 1146
REVIEWS 1152
(Continued on inside cover) 
1973 


DECEMBER 


NEWS AND NOTICES
MATHEMATICAL ASSOCIATION OF AMERICA
The Fifty-fourth Summer Meeting of the Association . . . 1164
Report of the Treasurer for the Year 1972
Honorary Life Membership for Professor Emory P. Starke
April Meeting of the Maryland-District of Columbia-Virginia Section . . . 1182
June Meeting of the Northeastern Section
Mathematical Sciences Employment Register - Open Register . . . 1183
Acknowledgement
Calendars of Future Meetings
INDEX


NOTICE TO AUTHORS 


Specialized research is usually unsuitable; see Statement of Policy (vol. 76, p.2). Manuscript preparation: Please 
use the Manual for Monthly Authors (vol. 78, p. 1) and follow the format in current issues of the MONTHLY. 
Manuscripts should be typewritten, triple-spaced with wide margins; submit two copies and keep one for
protection against loss.
Backlog: Main Articles 12 months, Math. Notes 13 months, Research Problems 7 months, Classroom Notes
11 months, Math. Education 10 months.


EDITORIAL CORRESPONDENCE AND MAIN ARTICLES: to ALEX ROSENBERG, Department of Mathe- 
matics, Cornell University, Ithaca, N.Y. 14850; NOTES, etc.: to the corresponding Associate Editor; 
ADVERTISING CORRESPONDENCE: to RAOUL HAILPERN, Mathematical Association of America, 
SUNY at Buffalo, Buffalo, N. Y. 14214; CHANGE OF ADDRESS and SUBSCRIPTIONS: to A. B. 
WILLCOX, Mathematical Association of America, 1225 Connecticut Ave., N.W., Washington, D.C. 20036.


HARLEY FLANDERS, Editor
ALEX ROSENBERG, Editor-Elect
ASSOCIATE EDITORS 


JOSHUA BARLAZ J. G. HARVEY SEYMOUR SCHUSTER 
E.R. BERLEKAMP ERIC S. LANGFORD J. ARTHUR SEEBACH, Jr. 
JANE W. DI PAOLA P. D. LAX E. P. STARKE 

ROBERT GILMER ARTHUR MATTUCK LYNN A. STEEN 
RICHARD GUY M. W. POWNALL JAMES WENDEL 

RAOUL HAILPERN GIAN-CARLO ROTA 


Annual dues for members of the Association (including a subscription to the American 
Mathematical Monthly) are $12.50. For nonmembers the subscription price is $18.00. 


PUBLISHED BY THE ASSOCIATION at Washington, D. C., and Menasha, Wisconsin, during the months of January, 
February, March, April, May, June—July, August-September, October, November, December. 


Second-class postage paid at Washington, D. C., and additional mailing offices. 
Copyright © The Mathematical Association of America (Incorporated), 1973 


PRINTED IN THE UNITED STATES OF AMERICA 


NUMBER FIFTY-TWO 


The retiring editorial staff nursed from cradle through publication ten MONTHLYS 
per year for the last five years, also two Slaught papers. We hope most have been 
satisfactory, some excellent. Our debt to our authors and referees is great, also to 
our readers for their steady support and encouragement. 

I personally am grateful to all of my associate and collaborating editors. They
worked hard and competently to a high professional standard. They join me in
wishing our successors well.
Harley Flanders


PRIME NUMBERS AND BROWNIAN MOTION 
PATRICK BILLINGSLEY, The University of Chicago 


Because it factors into a product of prime numbers, each integer contains within 
it a kind of Brownian motion path, and the mathematics of Brownian motion can 
be used to derive theorems about the factorization. Despite the persistent notion that 
a result stated in probability language is rather less true than it might otherwise be, 
I shall state these theorems in probability language and even give them probabilistic 
proofs. As a matter of fact, there will be little in the way of real proofs, since for the 
most part I shall only illustrate general results by examples and special cases. For 
this there is the authority of William Feller, who used to tell us, his students, that the 
best in mathematics, as in art, letters, and all else —that the best consists of the 
general embodied in the concrete. Although at first I thought that was simply an 
antimilitary sentiment, I did eventually understand it as the intellectual-esthetic 
principle he intended and have tried ever since to keep it at the front of my mind. 

Patrick Billingsley received his Princeton Ph.D. under William Feller. Except for a period of Navy
service, he has been at the University of Chicago, where he is presently Professor of Mathematics and
Statistics. He was a Fulbright lecturer for a year at the University of Copenhagen, and held a Guggenheim
fellowship at Cambridge University. Professor Billingsley's main research interest is probability
theory, and he is a fellow of the Institute of Mathematical Statistics. His publications include the
books: Statistical Inference for Markov Processes, 1961, Ergodic Theory and Information, 1965,
Convergence of Probability Measures, 1968, Elements of Statistical Inference (with D. L. Huntsberger),
1973. Editor.

The paper has three sections. In Section 1, the mathematical model for a particle
in Brownian motion is defined and some of its properties described. Section 2, which
provides the link between Brownian motion and primes, concerns random walk: one
successively tosses a coin and successively moves along a scale, one unit in the positive
or negative direction, according as the coin falls heads or tails. Here it is shown how
a distant random walk looks approximately like a Brownian motion and how the 
Brownian motion model therefore leads to limit theorems associated with random 
walk. Section 3 discusses the random walk which a randomly chosen integer generates 
through its prime factorization: one successively examines the primes, 2, 3, 5, ..., and
successively moves along a scale, one unit in the positive or negative direction, according
as the prime appears in the factorization or not. It turns out that, because of
the arithmetic fact that distinct primes individually divide an integer if and only if 
their product does, this factorization random walk has many of the properties of the 
ordinary coin-tossing random walk; in particular, it too can be approximated by 
Brownian motion, and it is shown how this leads to limit theorems associated with 
factorization into primes. 

In addition to the elements of real analysis, the paper makes use of statistical 
concepts such as mean, variance, independence, and Gaussian distribution. 


1. Brownian motion. Imagine suspended in a fluid a particle bombarded by 
molecules in thermal motion. The particle will perform that irregular and seemingly 
random movement first described by the biologist Robert Brown in 1828. Since we 
shall be concerned with just one component of this motion, imagine it projected on a 
vertical axis: At each instant t of time we note the height x(t) of the particle above a 
fixed horizontal plane. Over T units of time, the motion of the particle, which we 
take to start at 0, is described by the positions x(t) for 0 ≤ t ≤ T, that is, by a
continuous real function x on [0, T] with x(0) = 0. This leads us to consider the
collection $C_0[0, T]$ of such functions x.


FIGURE 1 (horizontal axis: Time, from 0 to T; vertical axis: Position)


For technical reasons, we make $C_0[0, T]$ into a metric space by taking the distance
between two of its elements to be the maximum vertical distance between their 
graphs. This topology, the uniform topology, is of little direct concern here; it is 
brought in mostly as evidence that the discussion to follow does have a rigorous 
basis. 

The random motion of the particle is described by an assignment of probabilities
$P_T(A)$ to subsets A of $C_0[0, T]$; $P_T(A)$ represents the chance that the path traced out
by the particle lies in A, or is described by a function x that lies in A. Probabilities
represent long-run relative frequencies. If the total on a pair of dice is observed, the
possible outcomes are 2, 3, ..., 12. If many pairs of balanced dice are rolled independently,
the proportion among them producing the outcome 7 will be about
1/6. If a particle in Brownian motion is observed for T units of time, the possible
outcomes are the various elements of $C_0[0, T]$. If many independently moving
particles are observed, the proportion among them producing paths that lie in A will
be about $P_T(A)$. Although the interpretation of probability involves such multiple
observations, in the mathematical theory we speak of a single roll of the dice, the
probability the roll produces a 7 being 1/6; in the same way, we speak of a single
particle, the probability it traces out a path that lies in A being $P_T(A)$.


FIGURE 2 


The set $[x: \alpha \le x(t) \le \beta]$, consisting of the paths that go through the gate in
Figure 2, represents the event that at time t the particle will lie between $\alpha$ and $\beta$; it is
assigned probability


(1)  $P_T[x: \alpha \le x(t) \le \beta] = \frac{1}{\sqrt{2\pi t}} \int_{\alpha}^{\beta} e^{-u^2/2t}\, du.$


Thus the distribution of the position at time t follows the Gaussian curve with mean 0 
and variance t. That the mean is 0 reflects the fact that the particle is as likely to go 
up as to go down; there is no drift. The variance t grows linearly; this indicates that 
the particle tends to wander away from its starting point and, having done so, 
suffers no force tending to restore it to that starting point. The equation (1) can be
extended: the increment over [s, t] has a Gaussian distribution with mean 0 and
variance t - s.

FIGURE 3

The other important property of Brownian motion is this: Suppose $s < s' < t < t'$,
and consider for example the event $A = [x: x(s') - x(s) \ge 3]$ that the particle
undergoes an upward displacement of at least 3 units during the time interval $[s, s']$,
together with the event $B = [x: x(t') - x(t) < 0]$ that the particle undergoes a
downward displacement during the time interval $[t, t']$. The top path in Figure 3 lies
in A but not in B, and the bottom path lies both in A and in B. The probabilities of
A and B and of their intersection $A \cap B$ are related by


(2)  $P_T(A \cap B) = P_T(A)\, P_T(B).$


Thus A and B satisfy the definition of independence; that is, the displacement the
particle undergoes during $[s, s']$ in no way influences the displacement it undergoes
during $[t, t']$. This implies a kind of lack of memory. Although the future behavior
of the particle depends on its present position, it does not depend on how the particle 
got there. Equation (2) has a more general form showing that the increments over any 
number of disjoint intervals are statistically independent of one another. 

The equations (1) and (2), together with generalized versions of them, determine
all the probabilities $P_T(A)$. (This ignores a technical point: $P_T(A)$ cannot be defined
for every subset A of $C_0[0, T]$, but it can for every Borel set A, that is, for every A
in the $\sigma$-field generated by the sets open in the uniform topology.) It was one of
Norbert Wiener's achievements to prove in 1923 that there does exist an assignment
of probabilities satisfying these rules, and $P_T$ (the corresponding measure on the
Borel sets) is accordingly called Wiener measure. Here we shall take its existence for
granted.

Brownian motion, as described by Wiener measure, obeys a transformation law 
having consequences strange and deep. Suppose that a particle performs a Brownian 
motion for T units of time, and suppose that, in the function representing its path, 
we contract the time scale by the factor T and the position scale by the factor $\sqrt{T}$.
According to the law in question, the new path will be exactly like that of a particle 
that has been in Brownian motion for 1 unit of time. 

To understand why, let x and y be the old and new paths, so that x lies in
$C_0[0, T]$, y lies in $C_0[0, 1]$, and


(3)  $y(t) = \frac{1}{\sqrt{T}}\, x(tT), \qquad 0 \le t \le 1.$


Of course (3) defines a mapping


(4)  $C_0[0, T] \to C_0[0, 1].$

The transformation law says that, if x is a random path in $C_0[0, T]$ distributed
according to $P_T$, then y is a random path in $C_0[0, 1]$ distributed according to $P_1$.
(Technically, if $\phi_T$ is the mapping (4), then $P_1 = P_T \phi_T^{-1}$.) Now, according to (1),
the quantity x(tT) is a Gaussian random variable with mean 0 and variance tT.
Multiplying a Gaussian random variable by a constant a multiplies its mean by a
and its variance by $a^2$, and the new variable is also Gaussian. Therefore the distribution
of y(t) as defined by (3) follows the Gaussian curve with mean 0 and
variance t (since $T^{-1/2} \cdot 0 = 0$ and $(T^{-1/2})^2 \cdot tT = t$), the first requirement for Brownian
motion. Contracting time by the factor T leads from a path over [0, T] to a path
over [0, 1], and the vertical rescaling by $1/\sqrt{T}$ makes the variances work out right.
Moreover, x has (over disjoint intervals) independent increments, and it is intuitively
clear that monotone changes of the time and position scales cannot convert independent
increments into dependent ones. So the transformation (3) must preserve
the other property of Brownian motion, that of independent increments. This
argument, which makes the transformation law plausible, can be converted into a
complete proof.

By means of the transformation defined by (3), it is possible to see that, whatever
positive values $\epsilon$ and K may have, a Brownian path over [0, 1] will with probability
exceeding $1 - \epsilon$ have somewhere a chord with slope exceeding K. The trick is this:
We want, over [0, 1], a Brownian path y with a steep chord. We obtain it not directly,
but by applying the transformation (3) to a Brownian path x over [0, T] with T
suitably chosen. Choose T so large that x will, with probability exceeding $1 - \epsilon$,
have somewhere a chord with slope exceeding, say, 1. Such a T exists because even
the most miraculous event will happen in the long run (the monkeys at the typewriters),
and the occurrence of a chord with slope exceeding 1 is a modest miracle
indeed. At the same time, choose T to exceed $K^2$. If x has a chord with slope exceeding
1, and if x and y are related by (3), then y has a chord with slope exceeding
$\sqrt{T}$, which in turn exceeds K.

Since $\epsilon$ may be taken arbitrarily small and K arbitrarily large, a Brownian path
over [0, 1] must with probability 1 have chords with arbitrarily great slope. There
must also be chords with arbitrarily large negative slope, and in fact, chords (very
short ones) with extreme slopes are dense along the path. In rigorous and more
elaborate form, these arguments show that, if A is the set of paths in $C_0[0, 1]$ of
unbounded variation, then $P_1(A) = 1$. A path of unbounded variation represents the
motion of a particle that in its wanderings back and forth travels an infinite distance,
and at this point physicists lose interest because of their obsession with reality. The
fact is mathematically interesting, however, and so is the fact that $P_1(A) = 1$ if A is
the set of functions in $C_0[0, 1]$ that are nowhere differentiable. Constructing a
continuous, nowhere differentiable function is difficult, but drawing an element from
$C_0[0, 1]$ randomly according to $P_1$ produces such a function with probability 1.

In what follows we shall be mainly concerned with sets that correspond more 
closely with reality. Although Sections 2 and 3 will involve the transformation (3) 
and T’s that exceed 1, for the rest of this section we shall take T = 1. We shall need 
(1) for the case T=t=1: 


(5)  $P_1[x: \alpha \le x(1) \le \beta] = \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\beta} e^{-u^2/2}\, du.$


Suppose $\alpha \ge 0$ and consider the event $[x: \max x(t) \ge \alpha]$ that the particle achieves
the height $\alpha$ at some time t with 0 ≤ t ≤ 1. First,


$P_1[x: \max x(t) \ge \alpha] = P_1[x: \max x(t) \ge \alpha \text{ and } x(1) \ge \alpha]$
$\qquad + P_1[x: \max x(t) \ge \alpha \text{ and } x(1) < \alpha].$


The two probabilities on the right here can be proved equal, roughly because once the
particle achieves the height $\alpha$ it is as likely, in the absence of drift, to wander upward
and finish above $\alpha$ at time 1 as to wander downward and finish below $\alpha$. Thus


$P_1[x: \max x(t) \ge \alpha] = 2 P_1[x: \max x(t) \ge \alpha \text{ and } x(1) \ge \alpha].$


Since the condition $\max x(t) \ge \alpha$ is superfluous in the presence of the condition
$x(1) \ge \alpha$, the right side here is $2 P_1[x: x(1) \ge \alpha]$, and (5) with $\alpha \ge 0$ and $\beta = \infty$ now
implies


(6)  $P_1[x: \max x(t) \ge \alpha] = \frac{2}{\sqrt{2\pi}} \int_{\alpha}^{\infty} e^{-u^2/2}\, du.$


Thus we have the distribution of the greatest positive excursion. 


FIGURE 4 (the set [t: x(t) > 0] marked on the time axis from 0 to 1)


Although to make it rigorous requires some effort, this derivation of (6) has an
intuitive appeal. The next result will be stated without any proof, and like many
ex cathedra assertions, it runs counter to intuition. Consider the set [t: x(t) > 0] of
time points t, 0 ≤ t ≤ 1, for which the particle is above 0. This set is a union of
intervals (infinitely many, contrary to Figure 4). Denote by bars the Lebesgue measure
of this set, the sum of the lengths of the constituent intervals: | [t: x(t) > 0] |. The
distribution of this quantity, the total time spent above 0, is given by


(7)  $P_1\big[x: \alpha \le |[t: x(t) > 0]| \le \beta\big] = \frac{1}{\pi} \int_{\alpha}^{\beta} \frac{du}{\sqrt{u(1-u)}}$


for $0 \le \alpha < \beta \le 1$. This is Paul Lévy's arc sine law, so called because carrying out
the integration leads to the arc sine function.


FIGURE 5 


Figure 5 shows the shape of the density, the area of the shaded region representing
the right side of (7). The curve is U-shaped, so that if the length $\beta - \alpha$ of the interval
is fixed, the probability of $\alpha \le |[t: x(t) > 0]| \le \beta$ grows as the interval nears 0 or 1,
being smallest when the interval is centered on 1/2. This is odd because the time spent
above 0 has mean 1/2 by symmetry, and ordinarily values near the mean of a random
quantity are more likely to occur than are values far removed from the mean, whereas
here the situation is just the opposite.

For general accounts of Brownian motion, see [4] and [7]. 


2. Random walk. Imagine a particle moving about at random on the nodes of a
cubic lattice. The particle can move in any of six directions (north, south, east,
west, up, down) to an adjacent node. The direction is determined by the roll of a
balanced die, the particle moves to the next node, and the die is rolled once more
to determine the direction of the next move, and so on. Figure 6 shows five steps of
such a random walk, together with one of the cells of the cubic lattice. The figure is in
the spirit of a venerable vector analysis book which began a proof of Gauss's theorem
by enjoining the reader to consider "an infinitesimal element of volume of dimensions
dx, dy, and dz." This injunction was accompanied by a nicely labelled diagram
like Figure 7, which was said to show such an infinitesimal element of volume "much
enlarged." Well, Figure 6 is much enlarged too, and if the cubes of the lattice are
really very small and the particle moves very rapidly from node to node it is natural
to expect the motion to approximate Brownian motion.


FIGURE 6


FIGURE 7


We shall explore a one-dimensional version of this idea. Consider a vertical axis
with the integer points 0, ±1, ±2, ... marked off on it. We start at 0, toss a coin,
and move upward one unit if the coin falls heads and downward one unit if the coin
falls tails. In the new position (+1 or -1), we toss the coin again and move up or
down one unit according as it falls heads or tails, and we continue this way for T
steps, T being here an integer. If we take one unit of time to execute each step of this
random walk and proceed at a uniform rate from one node to the next, our progress
is described by a function like that in Figure 8, a polygonal path whose height over i
is the position at i, that is, the position after the ith step. Of the $2^T$ such paths, each
has probability $2^{-T}$. (Various aspects of random walk are discussed in [3].)


FIGURE 8 (vertical axis: position at i)


The path can also be viewed as describing the fluctuations in a gambler’s fortune. 
The position on the vertical axis represents the gambler’s fortune (relative to his 
initial capital, so that he starts conventionally at 0), and it moves up or down one 
unit — say one pound — according as he wins or loses the next play. 

The random walk path has some of the properties of a Brownian motion path
over [0, T]. In the first place, for integers with $i < i' < j < j'$, the displacements
undergone over the time intervals $[i, i']$ and $[j, j']$ are independent because they
depend on disjoint sets of tosses and the tosses are assumed independent (the coin
has no memory). Thus the path has essentially independent increments (for intervals
with nonintegral endpoints the increments can be slightly dependent). The distance
moved in one step has mean


(8)  $(+1) \cdot \tfrac12 + (-1) \cdot \tfrac12 = 0$

and variance

(9)  $(+1)^2 \cdot \tfrac12 + (-1)^2 \cdot \tfrac12 = 1,$


and so the position at i has mean 0 and by independence has variance i, another
property of Brownian motion (see equation (1)). (For nonintegral t, the position at t
has mean 0, but the variance is only approximately t.) Although the polygonal
character of the path is not shared by Brownian motion, contraction of the two
scales will make the straight-line segments in Figure 8 disappear in the limit as $T \to \infty$.


FIGURE 9 (the path of Figure 8 rescaled, with vertical steps of $\pm 1/\sqrt{T}$)
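The mean and variance of the position at i can be confirmed by enumerating all $2^T$ paths for a small T. This is a minimal sketch assuming equally likely paths, as in the text; the function name `walk_moments` is illustrative.

```python
from fractions import Fraction
from itertools import product

def walk_moments(T):
    """Enumerate all 2**T coin-toss sequences and return, for i = 1..T, the exact
    mean and variance of the position after the ith step."""
    weight = Fraction(1, 2 ** T)   # each path has probability 2^(-T)
    means = [Fraction(0)] * T
    second_moments = [Fraction(0)] * T
    for steps in product((+1, -1), repeat=T):
        position = 0
        for i, step in enumerate(steps):
            position += step
            means[i] += weight * position
            second_moments[i] += weight * position * position
    variances = [m2 - m * m for m, m2 in zip(means, second_moments)]
    return means, variances

means, variances = walk_moments(10)
```

With T = 10 the loop visits 1024 paths, and exact rational arithmetic gives mean 0 and variance i at every step i, in agreement with (8) and (9).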

Suppose we contract the time scale by a factor T and the vertical scale by a factor
$\sqrt{T}$, applying the transformation (3) to pass from Figure 8 to Figure 9. In Figure 8
the segments have length $\sqrt{2}$, whereas in Figure 9 they are very short for large T,
having length of the order $1/\sqrt{T}$. If Figure 8 represented a Brownian motion path
over [0, T], then, as explained in Section 1, Figure 9 would represent a Brownian
motion path over [0, 1]. The transformation (3) leaves invariant those characteristics
(means, variances, independence of increments) the original path shares with Brownian
motion and tends to mask those characteristics (piecewise linearity) it does not
share. Thus we can hope that the curve in Figure 9 will be very like a Brownian
motion path for large T. And indeed, it is true that


(10)  $\mathrm{Prob}[\text{path} \in A] \to P_1(A) \qquad (T \to \infty)$


for subsets A of the space $C_0[0, 1]$, where $P_1(A)$ is Wiener measure. There are $2^T$
paths like the one in Figure 9, and $\mathrm{Prob}[\text{path} \in A]$ is $2^{-T}$ times the number of them
that lie in A.

For an illustration of this theorem, suppose the A in (10) is the set $[x: \alpha \le x(1) \le \beta]$
of paths in $C_0[0, 1]$ that over the point t = 1 have a height between $\alpha$ and $\beta$. Since
the height over t = 1 in Figure 9 is $1/\sqrt{T}$ times the position at T in the original
random walk, (10) and (5) together imply


(11)  $\mathrm{Prob}\Big[\alpha \le \frac{\text{position at } T}{\sqrt{T}} \le \beta\Big] \to \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\beta} e^{-u^2/2}\, du.$

This is the classical DeMoivre-Laplace central limit theorem for Bernoulli trials. It
describes the position after a large number of steps in a random walk, or the gambler's
fortune at the end of an evening's play of T ventures. If $-\alpha = \beta = .9$, the limit in
(11) is about .6. If T = 100, the gambler thus has probability approximately .6 of
ending the evening within $.9 \times \sqrt{100} = 9$ pounds of his initial capital.

Suppose now that A is the set in (6), the set of paths in $C_0[0, 1]$ having somewhere
a height at least $\alpha$ (here $\alpha \ge 0$). The path in Figure 9 lies in A if at some time during the
evening's play the gambler's fortune is at least $\alpha\sqrt{T}$ pounds above his initial capital,
and by (10), the probability of this converges to the right side of (6). For $\alpha = 1.7$, the
value of this limit is about .1. With T = 100, this gives an approximate probability of
.1 that the gambler will have been at least $1.7 \times \sqrt{100} = 17$ pounds ahead at the
time he should have quit.

Finally, suppose A is the set $[x: \alpha \le |[t: x(t) > 0]| \le \beta]$ in (7). During the
evening the gambler is ahead a certain fraction of the time; if the curve in Figure 9
represents the history of his fortunes, it belongs to the set A if and only if this fraction
lies between $\alpha$ and $\beta$. The chance of this event is by (10) and (7) about equal to the
area of the shaded region in Figure 5. If we compute the areas, the chance the gambler
is ahead between 45% and 55% of the time turns out to be only about .06, whereas
the chance he is ahead more than 90% of the time is about .2. In one evening in five
the gambler will thus be ahead more than 90% of that evening's play. By symmetry, in
one evening in five the gambler will be ahead less than 10% of that evening's play.
To convince him in the first [second] case that his experience is due merely to chance
and not to his being Fortune's favorite [Fortune's fool] will be difficult [impossible].
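The numerical values quoted in these three illustrations can be reproduced from the closed forms of the integrals in (5), (6), and (7). The sketch below uses only Python's standard `math` module; the function names are illustrative, and `exact_walk_prob` is an added comparison with the exact coin-tossing probability for T = 100, which the text does not compute.

```python
from math import asin, comb, erf, erfc, pi, sqrt

def gauss_prob(alpha, beta):
    """Right side of (5) and (11): (1/sqrt(2 pi)) * integral of exp(-u^2/2) over [alpha, beta]."""
    return 0.5 * (erf(beta / sqrt(2)) - erf(alpha / sqrt(2)))

def max_prob(alpha):
    """Right side of (6): (2/sqrt(2 pi)) * integral of exp(-u^2/2) over [alpha, infinity)."""
    return erfc(alpha / sqrt(2))

def arcsine_prob(alpha, beta):
    """Right side of (7): (1/pi) * integral of du / sqrt(u(1-u)) over [alpha, beta]."""
    return (2 / pi) * (asin(sqrt(beta)) - asin(sqrt(alpha)))

def exact_walk_prob(T, bound):
    """Exact probability that a T-step coin-tossing walk ends with |position| <= bound."""
    favorable = sum(comb(T, h) for h in range(T + 1) if abs(2 * h - T) <= bound)
    return favorable / 2 ** T

within_nine = gauss_prob(-0.9, 0.9)        # about .6
ever_17_ahead = max_prob(1.7)              # about .1
ahead_45_to_55 = arcsine_prob(0.45, 0.55)  # about .06
ahead_over_90 = arcsine_prob(0.90, 1.0)    # about .2
exact_within_nine = exact_walk_prob(100, 9)
```

The last quantity counts paths directly, so comparing it with `within_nine` gives a sense of how close the limit in (11) already is at T = 100.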

We have applied (10) to three interesting sets A. If A is the set of functions in
$C_0[0, 1]$ of unbounded variation, then $P_1(A) = 1$, as explained in Section 1, while
$\mathrm{Prob}[\text{path} \in A] = 0$ because the curve in Figure 9 is visibly of bounded variation.
Thus (10) fails for certain subsets A of $C_0[0, 1]$. The mathematical fact is that (10)
holds for every set (Borel set) A whose boundary $\partial A$ (boundary in the sense of the
uniform topology) satisfies $P_1(\partial A) = 0$, a condition which holds in our three applications
but not if A is the set of functions of unbounded variation. A complete
proof of this theorem uses a combination of probability theory and functional
analysis; the details can be found in [1].


3. Prime divisors. According to the fundamental theorem of arithmetic, each
integer has a factorization into primes, a factorization unique except for order (see
[5], for example). Let f(n) be the number of distinct primes in the factorization of n;
we do not count multiplicity: $f(3^4 \cdot 5^2)$ is 2, not 6. The table shows some values of the
function f.


     n    2  3  4  5  6  7  ...  29  30  31  ...  209  210  211
  f(n)    1  1  1  1  2  1  ...   1   3   1  ...    2    4    1


It rises slowly. The smallest n's with respective f-values 2, 3, and 4 are
2·3 = 6, 2·3·5 = 30, and 2·3·5·7 = 210. The fact that there are infinitely
many primes implies, however, that f assumes arbitrarily large values; since f(p) = 1
for prime p, the same fact implies that f infinitely often drops back to 1.
Since f varies in this irregular fashion, it is natural to ask after its average behavior.
For example, it can be shown that


(12)  $\frac{1}{N} \sum_{n=1}^{N} f(n) \approx \log\log N$


(see the remarks following (17) below). Since $\log\log 10^{70} \approx 5$, the typical integer
under $10^{70}$ has a mere five prime divisors. More delicate questions concern the
distribution of f. If S is a set of positive integers, let $P_N(S)$ be the fraction among
the integers 1, 2, ..., N that lie in S:


(13)  $P_N(S) = \frac{1}{N}\, \#[n: 1 \le n \le N \text{ and } n \in S].$


The problem is to get information about quantities like $P_N[n: a \le f(n) \le b]$.
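The function f, its average, and the fractions $P_N$ are easy to tabulate for moderate N. The following is a small sketch using a sieve; the choice N = 100,000 and the names `distinct_prime_counts` and `P_N` are assumptions made for illustration. At this size the average agrees with log log N only roughly.

```python
from math import log

def distinct_prime_counts(limit):
    """Return a list f with f[n] = number of distinct primes dividing n, for 0 <= n <= limit."""
    f = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if f[p] == 0:  # no smaller prime divides p, so p is prime
            for multiple in range(p, limit + 1, p):
                f[multiple] += 1
    return f

N = 100_000
f = distinct_prime_counts(N)

# Values appearing in the table of the text.
table = {n: f[n] for n in (2, 3, 4, 5, 6, 7, 29, 30, 31, 209, 210, 211)}

# Left and right sides of (12).
average = sum(f[1:]) / N
loglog = log(log(N))

def P_N(a, b):
    """Fraction of n in 1..N with a <= f(n) <= b, as in (13)."""
    return sum(1 for n in range(1, N + 1) if a <= f[n] <= b) / N
```

For instance, `P_N(1, 1)` is the fraction of integers up to N that are primes or prime powers.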

Now (13) can be viewed as a probability: We draw an integer at random from the
range $1 \le n \le N$, and $P_N(S)$ is the probability that it will lie in S. That $P_N[n : a \le f(n) \le b]$
can be viewed as a probability does not by itself ensure (this may be difficult to credit)
that probability theory will help in the evaluation. It does in fact help because the
notion of independence can be brought to bear. If $\delta_p(n)$ is 1 or 0 according as the
prime p divides n or not, then $f(n) = \sum_p \delta_p(n)$. We can understand the distribution
of f(n) if we understand the joint behavior of the $\delta_p(n)$ as random quantities.

The number of multiples of p up to N is the integral part [N/p] of N/p. The
probability that $\delta_p(n) = 1$, or that $p \mid n$, is thus

(14)    $P_N[n : p \mid n] = \frac{1}{N}\left[\frac{N}{p}\right] \approx \frac{1}{p}$.

The approximation here is good for large N: since [N/p] differs from N/p by less
than 1, the error in (14) is less than 1/N. The formula (14) reflects the fact that p
divides every pth integer, and it in no way requires that p be prime.

The fundamental theorem of arithmetic implies that, if integers a and b are
relatively prime (share no prime factors), then they individually divide n if and only if
their product ab divides n. This fact is well illustrated by the use Turing is said to
have made of it. The sprocket wheel of his bicycle had a faulty tooth and the chain a
faulty link, and unless he was pedalling very fast when the faulty parts meshed, the
chain would fall off. So he counted the number, say a, of teeth on the wheel and the
number, say b, of links on the chain and found, not to his surprise, that a and b were
relatively prime. Between successive meetings of the bad tooth and link the sprocket
wheel would in consequence go through b cycles, as the chain went through a cycles.
Turing is said to have pedalled along counting, on every bth cycle of the sprocket
wheel giving the burst of speed necessary to carry him past the danger point.
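The arithmetic behind the anecdote can be checked directly. In this small sketch the tooth and link counts (9 and 20) are hypothetical, chosen only because they are relatively prime; the story does not record the actual numbers.

```python
from math import gcd

def meeting_steps(a, b, horizon):
    """Steps (teeth advanced) at which tooth 0 of an a-tooth wheel meets link 0 of a b-link chain."""
    return [s for s in range(1, horizon + 1) if s % a == 0 and s % b == 0]

a, b = 9, 20  # hypothetical counts of teeth and links
assert gcd(a, b) == 1

steps = meeting_steps(a, b, 3 * a * b)
print(steps)                    # [180, 360, 540]: the multiples of a*b
print([s // a for s in steps])  # [20, 40, 60]: wheel revolutions, multiples of b
```

Since a and b are relatively prime, a step is divisible by both exactly when it is divisible by ab, so the faulty parts meet once in every b revolutions of the wheel.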

As a special case of this fact, distinct primes p and q individually divide n if and
only if pq does. By this and by (14) with pq in place of p,

        $P_N[n : p \mid n \text{ and } q \mid n] = P_N[n : pq \mid n] = \frac{1}{N}\left[\frac{N}{pq}\right] \approx \frac{1}{pq} = \frac{1}{p} \cdot \frac{1}{q}$.

Since by (14) the factors 1/p and 1/q respectively approximate $P_N[n : p \mid n]$ and
$P_N[n : q \mid n]$ if N is large, we arrive at

(15)    $P_N[n : p \mid n \text{ and } q \mid n] \approx P_N[n : p \mid n]\, P_N[n : q \mid n]$.

Thus the events $[n : p \mid n]$ and $[n : q \mid n]$ approximately satisfy the definition of
independence if n is random, $1 \le n \le N$, with N large. There is an extension of (15)
from two primes to three or more.
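The approximate independence in (15) can be observed numerically. The sketch below is an illustration under assumed values (N = 100,000 and three arbitrary pairs of primes); it counts by brute force rather than using the integral-part formula, so that (14) is not presupposed.

```python
def fraction_divisible(N, *divisors):
    """P_N[n : every given divisor divides n] for n drawn uniformly from 1..N."""
    return sum(1 for n in range(1, N + 1) if all(n % d == 0 for d in divisors)) / N

N = 100_000  # illustrative choice
for p, q in [(2, 3), (3, 5), (7, 11)]:
    joint = fraction_divisible(N, p, q)
    product = fraction_divisible(N, p) * fraction_divisible(N, q)
    # The two sides of (15) and their gap, which is of order 1/N.
    print(p, q, joint, product, abs(joint - product))
```

Each side of (15) is within about 1/N of 1/(pq), so the printed gaps are a small multiple of 1/N.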

We can use this fact to construct a kind of random walk path containing information
about the prime factorization of n and in particular about f(n). We draw an
integer n at random from among 1, 2, ..., N. On a vertical axis with the integer points
marked off on it, we start at 0 and go up one unit if $2 \mid n$ and down one unit if $2 \nmid n$.
From our new position (+1 or −1), we go up one unit if $3 \mid n$ and down one unit if
$3 \nmid n$. We proceed in this way, examining each prime in succession. Figure 10
describes this factorization random walk in the same way that Figure 8 describes the
coin-tossing random walk. Each number on the time axis is the prime corresponding
to that step in the random walk. We consider later how long to continue the walk.


FIGURE 10 


Since n is random, this path is random. But since the randomness is all in the
drawing of n before the walk starts, the factorization random walk may seem less
random than the coin-tossing random walk. This is an illusion. We may imagine
tossing the coin T times in advance of the walk, recording the sequence of heads and
tails, and only then performing the corresponding walk. Since we would see its
whole history on record before setting out, the walk would be very dull. So imagine a
friend who tosses the coin T times and records the results in advance of the journey,
and imagine that, rather than show us the record all at once, he instead reveals the
outcomes to us one by one as we execute the walk. This restores the suspense. For
the factorization random walk, we can imagine a friend who draws n at random,
$1 \le n \le N$, factors n into primes, and at each step of the walk reveals to us whether or
not the corresponding p divides n.
The increment of the random path in Figure 10 over an interval depends on how
many in the corresponding set of primes divide n. Increments over disjoint intervals
depend on disjoint sets of primes and hence by (15), or by (15) together with its
extension to three or more primes, the increments will be approximately independent
if N is large. Unlike Brownian motion, however, the factorization random walk has
a strong downward drift. By (14), the chance of going downward at the step
corresponding to p is about 1 − 1/p, which is almost 1 for large p. The remedy is to
move up a distance 1 − 1/p if $p \mid n$ and to move down only a distance 1/p if $p \nmid n$.
The expected distance moved is now

        $(1 - p^{-1})\, P_N[n : p \mid n] + (-p^{-1})\, P_N[n : p \nmid n]$,

which by (14) is approximately

        $\left(1 - \frac{1}{p}\right)\frac{1}{p} + \left(-\frac{1}{p}\right)\left(1 - \frac{1}{p}\right) = 0$.

This corresponds with (8), an equation which shows that the coin-tossing random
walk has no drift.


FIGURE 11 (path height $\sum_{q \le p} (\delta_q(n) - 1/q)$ plotted against time; the step for p has width 1/p and ends near $\log\log p$)


Since the mean distance moved at the step corresponding to p is approximately 0,
the variance is approximately $(1 - p^{-1})^2 P_N[n : p \mid n] + (-p^{-1})^2 P_N[n : p \nmid n]$, which by
(14) is in turn approximately

        $\left(1 - \frac{1}{p}\right)^2 \frac{1}{p} + \left(\frac{1}{p}\right)^2 \left(1 - \frac{1}{p}\right) = \frac{1}{p}\left(1 - \frac{1}{p}\right) \approx \frac{1}{p}$.

The distance moved thus tends to be very small for large p, in contrast with the
coin-tossing random walk, which by (9) proceeds with vigor ever undiminished.
The remedy this time is to spend only an amount of time 1/p executing the step
corresponding to p. With these two modifications, the path is as in Figure 11.

To recapitulate, the time interval corresponding to the prime p has length 1/p.
Over this interval, the path rises an amount $\delta_p(n) - 1/p$; that is, it rises 1 − 1/p if
$p \mid n$ (the probability of this is approximately 1/p) and it rises 0 − 1/p if $p \nmid n$ (the
probability of this is approximately 1 − 1/p).
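The recapitulation above amounts to a short algorithm. The following sketch builds the break points of the adjusted walk for a single n; the function names, and the choice n = 210 with primes up to 1000, are assumptions made for illustration.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return [p for p in range(2, limit + 1) if is_prime[p]]

def factorization_walk(n, N):
    """Break points (t, height) of the adjusted walk for n over the primes p <= N.

    After the step for the prime p, t is the sum of 1/q over primes q <= p,
    and height is the sum of (delta_q(n) - 1/q) over the same primes.
    """
    t, height, points = 0.0, 0.0, [(0.0, 0.0)]
    for p in primes_up_to(N):
        delta = 1 if n % p == 0 else 0  # delta_p(n)
        t += 1 / p                       # time spent on this step
        height += delta - 1 / p          # rise over this step
        points.append((t, height))
    return points

path = factorization_walk(n=210, N=1000)
t_end, h_end = path[-1]
# Number of steps, final time, final height.
print(len(path) - 1, t_end, h_end)
```

For n = 210 = 2·3·5·7 the final height equals 4 minus the final time, since exactly four of the primes examined divide n.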




The point t in Figure 11 (the right endpoint of the interval corresponding to p) is
$\sum_{q \le p} 1/q$ (summation over primes q not exceeding p). The distance moved in the
step corresponding to q has variance about 1/q, and hence by the approximate
independence of the steps ((15) again), the variance of the position at this time t is
approximately $\sum_{q \le p} 1/q$, or t itself. The above adjustment of the factorization random
walk has thus not only eliminated the drift, it has so adjusted the time scale that the
variances are about what they are for the coin-tossing random walk and for Brownian
motion.

It can be shown that

(16)    $\sum_{q \le u} \frac{1}{q} \approx \log\log u$

for large u (the two expressions go to infinity with u and their difference remains
bounded; see [5, p. 351]). That the sum in (16), instead of increasing in some erratic
fashion, is asymptotic to a standard function like log log u is inessential to what
follows; but the formulas become simpler (and remain valid) if at each occurrence of
the sum we substitute the right side of (16).
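The bounded difference in (16) is easy to watch numerically. This sketch (the function names and the sample values of u are assumptions for illustration) prints the sum over primes, log log u, and their difference; the difference is known to settle near 0.26.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return [p for p in range(2, limit + 1) if is_prime[p]]

def reciprocal_gap(u):
    """Sum of 1/q over primes q <= u, minus log log u."""
    return sum(1 / q for q in primes_up_to(u)) - math.log(math.log(u))

for u in (10**2, 10**4, 10**6):
    total = sum(1 / q for q in primes_up_to(u))
    print(u, total, math.log(math.log(u)), reciprocal_gap(u))
```

Both columns grow without bound, but extremely slowly, while the last column stays nearly constant.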
Thus the t in Figure 11 is essentially log log p and the height $\sum_{q \le p} (\delta_q(n) - 1/q)$
of the curve over t is approximately

(17)    $\sum_{q \le p} \delta_q(n) - \log\log p$.

Now n has $\sum_{q \le p} \delta_q(n)$ prime divisors that do not exceed p, and we normalize this
quantity by subtracting away the value log log p it has for a "typical" n. (If n is
random, $1 \le n \le N$, then $\sum_{q \le p} \delta_q(n)$ has by (14) and (16) a mean of about log log p;
this is where (12) comes from.) The factorization random walk is a record of these
differences (17). We continue the walk until each $p \le N$ has been dealt with, and the
corresponding point on the time axis is $T = \sum_{p \le N} 1/p \approx \log\log N$.


FIGURE 12 (vertical axis: $\left(\sum_{q \le p} \delta_q(n) - \log\log p\right) / \sqrt{\log\log N}$)


The random path now resembles a coin-tossing path in that the increments are
almost independent for large N, there is essentially no drift, and the variances are
about right. As in the coin-tossing case, rescaling will lead in the limit ($N \to \infty$) to
Brownian motion. To send T to the point 1, we contract the horizontal scale by a
factor $T = \log\log N$, and, again as in the coin-tossing case and for the same reasons,
we contract the vertical scale by the square root of this, applying the transformation
(3). The point t in Figure 11 goes to $\log\log p / \log\log N$, and the path is that shown in
Figure 12.

Since the path depends on n and N, denote it $\mathrm{path}_N(n)$. Since n is random
($1 \le n \le N$), so is the path, and the chance that it lies in a given subset A of $C_0[0,1]$ is
$P_N[n : \mathrm{path}_N(n) \in A]$. The theorem linking primes with Brownian motion is this: If A
is a subset (Borel subset) of $C_0[0,1]$ satisfying $P_w(\partial A) = 0$, then

(18)    $P_N[n : \mathrm{path}_N(n) \in A] \to P_w(A)$    $(N \to \infty)$,


where $P_w(A)$ is Wiener measure. The proof of (18) uses a combination of probability
theory, functional analysis, and number theory. The theorem is given implicitly in
[8, p. 122], explicitly in a manuscript version of [1] and in a much more general
form in [9]. (For general discussions of probability methods in number theory, see
[6], [8] and the author's 1973 Wald lectures, to appear in the Annals of Probability.)

From Figure 12, a plot of the differences (17) normalized to

(19)    $\left(\sum_{q \le p} \delta_q(n) - \log\log p\right) / \sqrt{\log\log N}$,

we can read off arithmetic properties of n, and therefore (18) yields arithmetic limit
theorems. Consider the three sets A to which we applied the analogous result (10).
The height of the curve in Figure 12 over the time point 1 is (19) with N in place of p;
it is the number f(n) of prime factors of n, normalized to

        $(f(n) - \log\log N) / \sqrt{\log\log N}$.


The greater this is, the more highly composite n is, and the smaller it is, the more
"prime-like" n is. With $A = [x : \alpha \le x(1) \le \beta]$, it follows by (18) and (5) that

(20)    $P_N\left[n : \alpha \le \frac{f(n) - \log\log N}{\sqrt{\log\log N}} \le \beta\right] \to \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\beta} e^{-u^2/2}\, du$.

This is the Erdős-Kac central limit theorem for f. (For an elementary direct proof of
(20), see [2].)

For $-\alpha = \beta = .9$, the limit in (20) is about .6, and if $N = 10^{70}$, so that
$\log\log N \approx 5$, the double inequality in (20) is approximately the same as
$-.9 \le (f(n) - 5)/\sqrt{5} \le .9$, which in turn is approximately the same as $3 \le f(n) \le 7$. Thus
something like 60% of the integers under $10^{70}$ have from 3 to 7 prime divisors.
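The arithmetic of this example can be reproduced in a few lines. The sketch below is illustrative only (the helper name normal_cdf is an assumption); it evaluates the normal integral in (20) through the error function and recovers the range 3 to 7.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

alpha, beta = -0.9, 0.9
limit = normal_cdf(beta) - normal_cdf(alpha)  # right side of (20)

loglogN = math.log(70 * math.log(10))         # log log 10^70
low = loglogN + alpha * math.sqrt(loglogN)    # lower end of the range for f(n)
high = loglogN + beta * math.sqrt(loglogN)    # upper end of the range for f(n)

print(limit)             # about .63
print(loglogN)           # about 5.08
print(low, high)         # about 3.05 and 7.11
```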

The larger (17) is, the more highly composite n appears to be at that point in the
factorization; that is, (17) measures the apparent compositeness of n when it has been
tested for divisibility only by primes up to p. The maximum apparent compositeness
is measured by

(21)    $\max_{p \le N} \left(\sum_{q \le p} \delta_q(n) - \log\log p\right)$;

since this is $\sqrt{\log\log N}$ times the maximum height of the curve in Figure 12, an
application of (18) to the set in (6) gives its approximate distribution. The right side
of (6) being about .1 if $\alpha = 1.7$, for about 10% of the integers under $10^{70}$ does (21)
exceed $1.7 \times \sqrt{5} \approx 3.8$.

Let us say that n is excessive at p if

(22)    $\sum_{q \le p} \delta_q(n) > \log\log p$;

this holds if, with respect to divisibility by primes up to p, n is "more composite," or
"less prime-like," than the average integer. And (22) holds exactly when the
corresponding point on the curve in Figure 12 is above the axis. The polygonal segment
corresponding to p has length $p^{-1}/\log\log N$ when projected on the horizontal axis,
and so the amount of time the curve spends above 0 is essentially

(23)    $\frac{1}{\log\log N} \sum \left[\frac{1}{p} : p \le N \text{ and } \sum_{q \le p} \delta_q(n) > \log\log p\right]$,

the sum extending over those p at which n is excessive.

If we test n for divisibility by the primes in succession, spending an amount
1/p of time on p ($p \le N$), (23) is the fraction of time we are dealing with a p at which
n is excessive. From an application of (18) to the set in (7) it follows that for large N
the distribution of (23) approximately follows the density curve in Figure 5. For
about 20% of the integers under N the quantity (23) exceeds .9, for about 20% it is
less than .1, and for only about 6% does it lie between .45 and .55.
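The three percentages can be checked against the arc sine law. Figure 5 and the set in (7) lie outside this section, so the sketch below rests on an assumption: that the density in Figure 5 is the arc sine density for the fraction of time a Brownian path spends above 0, with distribution function $(2/\pi)\arcsin\sqrt{x}$ on [0, 1].

```python
import math

def arcsine_cdf(x):
    """Distribution function (2/pi) * arcsin(sqrt(x)) on [0, 1] (assumed to match Figure 5)."""
    return (2 / math.pi) * math.asin(math.sqrt(x))

above_09 = 1 - arcsine_cdf(0.9)                   # chance that (23) exceeds .9
below_01 = arcsine_cdf(0.1)                       # chance that (23) is less than .1
middle = arcsine_cdf(0.55) - arcsine_cdf(0.45)    # chance that (23) lies in (.45, .55)

print(above_09, below_01, middle)  # about .205, .205 and .064
```

Under that assumption the values agree with the figures of 20%, 20%, and 6% quoted above.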

Prime factors exhibit in this respect the same strange behavior coins do. In a way
they are even more strange. A quantity perhaps more natural to consider than (23) is

(24)    $\frac{1}{\pi(N)} \#\left[p : p \le N \text{ and } \sum_{q \le p} \delta_q(n) > \log\log p\right]$,

the number of p for which n is excessive at p, normalized by division by $\pi(N)$, the
total number of primes involved. For N large, of the break points in the polygon in
Figure 12 the great majority are very near 1, which has the result that in the limit the
distribution of (24) consists of a mass of 1/2 at 0 and a mass of 1/2 at 1: If $\epsilon > 0$ and N
exceeds some $N_\epsilon$, then (24) is less than $\epsilon$ with a probability lying in the range $1/2 - \epsilon$
to $1/2 + \epsilon$ and is greater than $1 - \epsilon$ with a probability lying in the same range. Thus
practically all integers are excessive either at practically all primes or at practically
none.


The 1972 Rouse Ball Lecture, given while the author was a Guggenheim Fellow, visiting
Peterhouse and the Statistical Laboratory of the University of Cambridge. It appeared in somewhat
different form in Eureka, the Journal of the Archimedeans, the Cambridge University Mathematical
Society.




References 


1. Patrick Billingsley, Convergence of Probability Measures, Wiley, New York, 1968.
2. Patrick Billingsley, On the central limit theorem for the prime divisor function, this MONTHLY, 76 (1969) 132-139.
3. William Feller, An Introduction to Probability Theory and Its Applications, vol. I, 3rd ed., Wiley, New York, 1968.
4. David Freedman, Brownian Motion and Diffusion, Holden-Day, San Francisco, 1971.
5. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 4th ed., Clarendon Press, Oxford, 1960.
6. M. Kac, Statistical Independence in Probability, Analysis and Number Theory, Carus Math. Monogr. 12, MAA, Wiley, New York, 1959.
7. Samuel Karlin, A First Course in Stochastic Processes, Academic Press, New York, 1966.
8. J. Kubilius, Probabilistic Methods in the Theory of Numbers, 2nd ed. (1962). Vilna: Gosudarstv. Izdat. Politič. i Naučn. Lit. Litovsk. SSR. (English translation 1964. Amer. Math. Soc. Transl. of Math. Monographs, Volume 11.)
9. Walter Philipp, Arithmetic functions and Brownian motion, Proc. Symp. Pure Math., vol. 24, AMS, 1973.


CORRECTION TO "UNIQUE FACTORIZATION DOMAINS"
P. M. COHN, Bedford College, University of London


The statement "Any Noetherian UFD is a Dedekind domain" (this MONTHLY,
80 (1973) 1-18) should be omitted.
The assertion is of course well known to be false; a correct statement would be:
A Dedekind domain is a UFD if and only if it is a principal ideal domain.
I am indebted to Professor J. H. Hays for drawing my attention to this error.


CORRECTION TO "A HISTORY OF THE PRIME NUMBER THEOREM"
L. J. GOLDSTEIN, University of Maryland


In my paper [this MONTHLY, 80 (June-July, 1973) 599-615] I asserted that the
sieve of Eratosthenes was known to the ancient Greeks and, in fact, appeared in
Euclid. It has been pointed out to me by Professor J. Albree that although the sieve
was known since approximately the time of Euclid, it does not appear in the Elements.
The author regrets the error.


MATHEMATICAL NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70808. 


COMPLEMENTS AND COMMENTS 
ROBERT GILMER AND DAVID ROSELLE 


This annual article is designed to provide our readers with an outlet for remarks 
on papers that have appeared in the notes sections of the MONTHLY. Because of the 
nature of the material we seek to publish, it is inherent that there will be some dupli- 
cation of results already in the literature; such duplication may not be undesirable, 
depending upon accessibility, presentation, etc. Under any circumstances, we are 
happy to have readers point out sources that are pertinent to articles we have publish- 
ed. During the past year, we have received the following information. 


Calculus. The article A unified proof of several basic theorems of real analysis
by Patrick Shanahan (this MONTHLY, 79 (1972) 895-8) has brought response from
several readers. Leonard Gillman calls attention to the notes by L. R. Ford (this
MONTHLY, 64 (1957) 106-8) and by W. L. Duren, Jr. (this MONTHLY, 64 (1957) part II,
19-22). B. J. Pettis also credits the idea to Professor Ford and points out that the
basic lemma appears in the calculus text by Brady and Mansfield. Professor Ray
Glenn has pointed out that the ideas here also appear in A creeping lemma by R. M.
F. Ross and G. T. Roberts (this MONTHLY, 75 (1968) 649-52). Daniel Mosenkis writes
that Professor Jesse Douglas utilized this approach in notes distributed to his
advanced calculus classes at C.C.N.Y. in the early part of the 1960's. An addendum to
Shanahan's paper is scheduled to appear in the Classroom Notes section and it will
examine some of these comments in more detail.

Richard Johnsonbaugh has noted that the technique used by J. P. Tull in A discovery
approach to e (February, 1973, p. 193) has also been set forth by Flanders,
Korfhage and Price in A First Course in Calculus with Analytic Geometry, pp.
204-5 and in Calculus, pp. 94-7. He has also pointed out that the approach taken by
R. B. Darst in Simple proofs of two estimates for e (February, 1973, p. 194) is similar
to that given by H. G. Eggleston in Elements of Real Analysis, Cambridge, 1962,
pp. 24-5. An article on e by Johnsonbaugh is scheduled to appear in the Classroom
Notes section in the near future.


Algebra. Several readers have commented on the proof of a characterization of
injective modules contained in the article by Azmi Hanna (March, 1973, pp. 297-8).
H. F. Kreimer points out that the characterization appears as exercise 5, page 371,
of MacLane and Birkhoff's Algebra; James W. Brewer observes that the result in
question appears as Lemma 7, page 187, of Rings and Fields (Kaplansky) and as
Theorem 3.18, page 48, of Joseph Rotman's Notes on Homological Algebra (Hanna's
article supplies the same proof given in Rotman's book).

J. C. Ault has mentioned a couple of transcription errors in his article Circle
groups of nilpotent rings (written jointly with J. F. Watters) in the January, 1973
MONTHLY (pp. 48-50). On page 49, g + h should be defined to be $gh\{m(g, h)\}^{-1}$, and
accordingly, on line 7 of page 50, −g is $g^{-1} m(g, g^{-1})$ instead of $g^{-1}\{m(g, g^{-1})\}^{-1}$.

Walter Rudin has pointed out that his proof of the unique factorization theorem 
in Z[i] (this MONTHLY, 68 (1961) 907-8) is essentially the same as that given by M. F. 
Ruchte and R. W. Ryden in A proof of uniqueness of factorization in the Gaussian 
integers (January, 1973, pp. 58-9). 

William H. Gustafson has supplied two references for the main result given by 
Claire Parkinson in Ambivalence in alternating symmetric groups (Feb., 1973, pp. 
190-2). These are J. L. Berggren, Finite groups in which every element is conjugate 
to its inverse, Pacific J. Math., 28 (1969) 289-93, and A. Kerber, Representations 
of permutation groups I, Springer-Verlag, Lecture Notes, 240 (1971) 14. J. L. Brenner
also writes that the main result of Parkinson's article follows from the known
result that, in $A_n$, a permutation and its inverse are conjugate, except in the case that
n > 2 and the parts of the cycle decomposition of the permutation are unequal and
odd and an odd number of them are congruent to 3 modulo 4. Brenner does not supply
us with a reference for this result, but we note that Kerber's volume contains a
treatment.

The main result of the article on convex matrix functions by M. H. Moore (April,
1973, pp. 408-9) has a long history. It has previously appeared in each of the following
articles: A multivariate generalization of Tchebichev's inequality, by P. Whittle,
Quart. J. Math., (2) 9 (1958) 232-240; A multivariate Tchebycheff inequality, by
I. Olkin and J. W. Pratt, Ann. Math. Stat., 29 (1958) 226-234; On the bias of functions
of the characteristic roots of a random matrix, by T. Cacoullus and I. Olkin,
Biometrika, 52 (1965) 87-94; A note on the expected value of the inverse of a matrix, by
T. Groves and T. Rothenburg, Biometrika, 56 (1969) 690; On the expectation of the
inverse of a matrix, by V. K. Srivastava, Sankhya (Series A) 32 (1970) 336. This
information was transmitted by M. D. Perlman and I. Olkin.

In his historical article on the invariance of parity of permutations (August-September,
1972, pp. 776-9), T. L. Bartlow makes the following statement. "When a
product of cycles is multiplied by a transposition, the number of cycles of the product
is either one greater or one less." He "proves" the statement by simply writing down
a formula. J. L. Brenner objects, on logical grounds, that Bartlow should establish
his formula by induction, stating clearly what assumptions concerning permutations
are being made.


Analysis. In their paper A weak parallelogram law for $l_p$ (November, 1972, pp.
1012-1015) W. L. Bynum and J. H. Drew attribute the inequality

        $(\|x\| + \|y\|)^p + \big|\,\|x\| - \|y\|\,\big|^p \le \|x + y\|^p + \|x - y\|^p$,  $1 < p \le 2$,  $x, y \in l_p$,

to O. Hanner in his article in Ark. Mat., 3 (1956) 239-244. But R. Askey points out that
Hanner indicates the preceding inequality is due to A. Beurling; Hanner repeats
Beurling's proof in his paper.

Jack Vilms observes that the orthogonality condition of R. K. Williams in his 
March, 1973 note on conformality of mappings is quite natural, when placed in the 
proper perspective. A sketch of Vilms’ approach is the following. Interpret conformal- 
ity of a mapping f: R — Rin terms of conformality of the Jacobian matrix of f. Then 
taking, in Williams’ notation, 


tof ote [Deft oe [th 


Vilms obtains Williams’ criterion. 

The December, 1972 MONTHLY contained an article by M. A. Golberg on the 
derivative of a determinant. P. Lancaster and J. Rokne indicate that the result Golberg 
labels ‘Main Theorem’ has a previous history. In his article The evaluation of deter- 
minants by the method of variation of parameters, Soviet Math. Doklady I (1960) 
316-9, D. F. Davidenko attributes the result to A. Halanai; a proof of the theorem 
that may be more accessible appears as Appendix II of Lancaster’s paper Algorithms 
for lambda-matrices, Numerische Math., 6 (1964) 388-394. Golberg’s Main Theorem 
also appears as exercise 7, page 195, of Lancaster’s book Theory of Matrices; his 
corollary 2 appears as exercise 8 on the same page. 


General. J. G. Mauldon calls attention to an interesting source that is likely to be
inaccessible to most MONTHLY readers: a high school trigonometry textbook, entitled
Advanced Trigonometry, by C. V. Durell and A. Robson. Mauldon states that the
elegant style of the book, which was published in London in 1930, was an inspiration
to many English school children. The book is mentioned here because of its relation
to two articles that appeared in the April, 1973 MONTHLY, one by I. Papadimitriou
and the other by T. M. Apostol. Papadimitriou gives a simple proof of the formula
$\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$, and Apostol generalizes to the series $\sum_{k=1}^{\infty} k^{-2n}$. In Example 7,
page 209, Durell and Robson present an elementary proof of the equality
$\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$; their methods are the same as those of Papadimitriou and Apostol.
Mauldon indicates that subsequent exercises in Durell-Robson give other results of
Apostol's paper.


Topology. Solomon Leader and Ernest Michael have (independently) observed
that Theorem 1 of the paper A theorem on set inclusion in metric spaces, by James 
Heinen and Albert Wilansky, in the January, 1973 MONTHLY carries over to a general 
topological space X that is not necessarily metrizable. Michael states that metrizabil- 
ity is used in the proof only in proving that D, is nonempty, and there the assumption 
is easily avoided. 


A PROBLEM ON SERIES 
G. J. O. JAMESON, Universitat Innsbruck 


For a sequence x of real or complex numbers, we denote by x(n) the nth term
of x. We prove the following result.


THEOREM. If $\sum |x(n)|$ is convergent and $\sum_{n=1}^{\infty} x(kn) = 0$ for each positive
integer k, then x is the zero sequence.


Proof. Let $p_n$ be the nth prime greater than 1, and let $R_n$ be the set of positive
integers that are not divisible by any of $p_1, \ldots, p_n$. Note that the first two members
of $R_n$ are 1 and $p_{n+1}$. We show that

(1)    $\sum_{j \in R_n} x(j) = 0$

for each n. From this it follows that $|x(1)| \le \sum_{j \ge p_{n+1}} |x(j)|$ for all n, and therefore
that x(1) = 0. A similar argument shows that, for any k, $\sum_{j \in R_n} x(kj) = 0$, and
hence that x(k) = 0.

For a bounded sequence y, write $\langle x, y \rangle = \sum x(n) y(n)$. Let $a_k$ be the sequence
with 1 in place nk (n = 1, 2, ...) and 0 elsewhere. The hypothesis states that
$\langle x, a_k \rangle = 0$ for each k.

Choose n, and for $1 \le r \le n$, let $T_r$ be the set of products of r distinct primes
chosen from $p_1, \ldots, p_n$. Let

        $b = a_1 + \sum_{r=1}^{n} (-1)^r \sum_{k \in T_r} a_k$.

We show that b(j) is 1 for $j \in R_n$ and 0 for other j; statement (1) then follows. If j
is in $R_n$, then $a_k(j) = 0$ for all k in $\bigcup_{r=1}^{n} T_r$, so $b(j) = a_1(j) = 1$. Suppose now
that j is not in $R_n$, and let $q_1, \ldots, q_m$ be the primes that divide j and are not greater
than $p_n$. Let $U_r$ be the set of products of r distinct primes chosen from $q_1, \ldots, q_m$.
Then $U_r$ has $\binom{m}{r}$ members, and $a_k(j) = 1$ for each k in $U_r$, so that each such k
contributes $(-1)^r$ to b(j). These, together with $a_1$, are the only $a_k$ that make a
non-zero contribution to b(j), which is therefore equal to

        $1 + \sum_{r=1}^{m} (-1)^r \binom{m}{r} = (1 - 1)^m = 0$.

This completes the proof.
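The inclusion-exclusion step of the proof can be verified mechanically for small cases. The sketch below is an illustration, not part of the note: the function name and the choice of the first four primes are assumptions. It computes b(j) from its definition and confirms that it is the indicator of $R_n$.

```python
from itertools import combinations
from math import prod

def b_value(j, primes):
    """b(j) = a_1(j) + sum over r >= 1 of (-1)^r * sum over k in T_r of a_k(j),

    where a_k(j) is 1 if k divides j and 0 otherwise, and T_r is the set of
    products of r distinct primes from the given list.
    """
    total = 1  # a_1(j) = 1 for every j
    for r in range(1, len(primes) + 1):
        for combo in combinations(primes, r):
            if j % prod(combo) == 0:
                total += (-1) ** r
    return total

primes = [2, 3, 5, 7]  # p_1, ..., p_n with n = 4
for j in range(1, 500):
    in_R = all(j % p != 0 for p in primes)  # j belongs to R_n
    assert b_value(j, primes) == (1 if in_R else 0)
print("b is the indicator of R_n for 1 <= j < 500")
```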


PROBLEM. Does the conclusion hold if we drop the assumption that $\sum |x(n)|$
is convergent?


Added in proof: This problem has been solved negatively by D. H. Fremlin. 


RESEARCH PROBLEMS 
EDITED BY RICHARD GUY 


In this Department the Monthly presents easily stated research problems dealing with notions 
ordinarily encountered in undergraduate mathematics. Each problem should be accompanied by 
relevant references (if any are known to the author) and by a brief description of known partial 
results. Manuscripts should be sent to Richard Guy, Department of Mathematics, Statistics, and 
Computing Science, The University of Calgary, Calgary 44, Alberta, Canada. 


MONTHLY RESEARCH PROBLEMS, 1969-73 
RICHARD K. Guy, The University of Calgary 


This continues the article written with Victor Klee [1971, 1113], referred to 
hereafter as GK. References to papers appearing in this section of the Monthly are 
in brackets containing the year and first page; they are not listed at the end. Other 
references are in parentheses; a date or tbp (to be published) refers to the list at the 
end of the article; wrc (written communication) is used when there are no (known)
publication plans. 

The editor welcomes comments on the Research Problems section. It has been 
suggested that since the MONTHLY is not a research periodical, research problems are 
out of place. On the other hand, there is the view that it is important for students and 
teachers to realize that though they may not (yet) feel capable of mathematical 
research, there are plenty of unsolved problems that are well within their compre- 
hension. Many amateurs are attracted to the subject, and many successful researchers 
first gained their confidence by examining intuitive problems in areas such as 
euclidean geometry, theory of numbers and, latterly, combinatorics and graph 
theory, where it is possible to understand questions (even to formulate them) and to 
obtain original results without a deep prior theoretical knowledge. Many years ago 
a friend telephoned me that he had got into an argument with his wife as to whether 
there was an infinity of primes. I find it significant that people without special 
mathematical training should have initiated such a discussion; moreover that I was 
able to outline Euclid’s proof to the enquirer’s satisfaction over the telephone. 

A point that occasionally arouses controversy is that the Research Problems 
section does not publish solutions. Occasionally, when space permits, solutions do 
appear, usually in the Mathematical Notes section, but normally papers should be 
submitted to journals regularly devoted to research. It is hoped that the problems 
appearing in this section are of interest to a reasonably wide band of the broad 
spectrum of MONTHLY readers, many of whom are undergraduates, college teachers 
and others, by no means all active in research. When a problem is solved, the solution, 
especially the first one, may be complicated, or prolix, or sophisticated, or just dull, 
and unsuitable for publication in the MONTHLY. To take an extreme example, the four
color conjecture is suited to these pages on grounds of intelligibility, though not on
others; the solution, when it appears, may be quite unsuited. However, I hope that 
those who solve or make partial progress with problems in this section will let me have 
offprints or preprints, or other comments and references, so that future updating 
articles can be as complete and accurate as possible. 

Carmichael's conjecture, concerning the non-uniqueness of solutions of the
equation $\varphi(x) = n$, where $\varphi$ is Euler's totient function, was discussed by Klee [1969,
288]. Grosswald (1973) has proved that the conjecture holds, except possibly for
integers n divisible by 32; this is a slight strengthening of a result of Donnelly (tbp),
who also shows that if $x_0$ is the least integer x for which $\varphi(x) = n$ has a unique
solution then $n = \varphi(x_0) \equiv 0 \pmod{2^{14}}$.


Larman (1971) answered a question of Klee [1969, 678], by proving that all convex
Borel sets in $E^3$ can be generated in a borelian manner within the realm of convexity.
He now writes that the same answer has recently been proved in $E^d$, $d \ge 2$, by
David Preiss of Prague.


Kronk [1969, 809] asked if there exists a hypotraceable graph. J. D. Horton 
(tbp) has constructed one from five copies of the Petersen graph (Figure 1). 


Kronk [1969, 1045] discussed the conjecture of Nash-Williams and Plummer that 
the square of every non-separable graph is hamiltonian. A number of proofs have 
been combined in a paper by G. Chartrand, A. M. Hobbs, H. A. Jung, S. F. Kapoor 
and C. St. J. A. Nash-Williams (tbp), whose main results are that the square of 
every 2-connected graph is hamilton-connected and that the square of every 2- 
connected graph of order at least 4 is 1-hamiltonian. This work stems from that of 
Fleischner (tbp); an exact reference can now be given to Fleischner and Kronk (1972). 
Chartrand writes that Zaks (1972) has shown that if G is 2-connected, $G^2$ need not
be 2-hamiltonian, and that Hobbs (1973) has shown that $G^2$ is vertex pancyclic.




I am indebted to Golomb, Rosa, Sheppard, Stanley and Stanton for helpful corre- 
spondence concerning Duke’s [1969, 1128] paper on Ringel’s (1963) tree-packing 
problem. Sheppard (wrc) has given a counterexample to the sufficiency condition 
mentioned in GK as having been suggested by Haggard and McWha. We referred 
in GK: to Kotzig and Rosa (1970), where Rosa (1966) would have been more ap- 
propriate. The former define a magic valuation as a labelling of the vertices and 
edges of a graph so that the total of the labels on each edge and its incident vertices 
is (the same ‘magic’) constant. What is needed here was called by Golomb (1972) a 
graceful numbering; a labelling of the vertices with non-negative integers so that 
the absolute differences of the ends of the e edges comprise the first e positive integers. 
There is no loss of generality in assuming that one of the vertices is labelled zero; 
such a graceful numbering was called a β-valuation by Rosa (1966). He used the name 
α-valuation if, in addition, there was a number lying between (not strictly) the labels 
of the two vertices incident with any edge; i.e., only bipartite graphs can have 
α-valuations. Trees are bipartite, but not all trees have α-valuations; Rosa gives the 
example of Figure 2. However, he found β-valuations (graceful numberings) for all 


FIG. 2. 


trees with 16 or fewer vertices. Kotzig (1972b) proved the following important result: 
given a tree, form an infinite family of trees by replacing each edge in turn by a path 
of arbitrary length; then such a family contains only a finite number of trees with no 
α-valuation. In a recent paper Stanton and Zarnke (tbp) show that if S and T are 
gracefully numbered trees, then the balanced trees formed from S and T can also 
be so labelled; a balanced tree of type I (resp. II) being formed by attaching a copy 
of T to every vertex (resp. with one exception) of S. Kotzig (1972a) discusses the 
relationship between α-valuations and magic valuations, but this paper is more 
concerned with the latter. A further use of magic labelling is that of Stanley (tbp) 
who labels the edges and requires the edge-sums at each vertex to be the same. This 
last is also the usage of Stewart (1966) whose interest derived from a problem of 
Sedláček (1963). Murty [1971, 1000] also used the word ‘magic’ on the encouragement 
of the present writer; we do not know of any progress with his problem. 
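
Golomb's definition of a graceful numbering is easy to test mechanically. The following Python sketch is only an illustration: the function name `is_graceful` and the edge-list representation are assumptions made here, and the requirement that labels be distinct is the usual convention rather than something stated in the text above.

```python
def is_graceful(edges, labels):
    """Return True if `labels` is a graceful numbering of the graph.

    edges  -- list of (u, v) pairs
    labels -- dict mapping each vertex to a non-negative integer
    """
    e = len(edges)
    values = list(labels.values())
    # Assumed convention: labels are distinct non-negative integers.
    if any(v < 0 for v in values) or len(set(values)) != len(values):
        return False
    # The absolute differences on the e edges must be exactly 1, 2, ..., e.
    diffs = sorted(abs(labels[u] - labels[v]) for u, v in edges)
    return diffs == list(range(1, e + 1))


# A path on four vertices, labelled 0, 3, 1, 2 along the path.
path = [("a", "b"), ("b", "c"), ("c", "d")]
print(is_graceful(path, {"a": 0, "b": 3, "c": 1, "d": 2}))
```

Since one vertex may be assumed to carry the label zero, the first labelling above is also a β-valuation in Rosa's sense.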

The asymptotic result on the problem discussed by Klee [1970, 63] on the longest 
d-dimensional snake has now appeared as Wyner (1971). Adelson, Alter, and Curtz 
(1973) have found new snakes of lengths 48, 86 and 128 in 7, 8 and 9 dimensions; 
the last two are longer than any previously known. Klee reports that Preparata and 
Nievergelt (tbp) have obtained new results for this problem. 


1973] RESEARCH PROBLEMS 1123 


The paper of Subbarao, Cook, Newberry and Weber (1972) on unitary perfect 
numbers [1970, 389] has now appeared, as has that of Usiskin and Wayment (1972) 
on partitioning a triangle into 5 triangles similar to it [1970, 867]. The more general 
problem [1971, 1118] of partitioning a triangle into 5 similar triangles has been 
examined by R. B. Killgrove (wrc) and in exhaustive detail by Don Coppersmith 
(wrc). There are evidently 10 essentially different configurations: Figure 3 is any 
isosceles triangle dissected into right triangles; Figures 4 and 5 are equilateral 
triangles, one dissected into right triangles, the other into triangles containing an 
angle of 120°; Figure 6 is a 90°, 60°, 30° triangle dissected into 120°, 30°, 30° ones. The 
other 6 cases are dissections of various scalene triangles of prescribed shape, 2 of 
them containing an angle of 60°; in these 2 the constituent triangles each contain an 
angle of 120°. 


FIG. 3.    FIG. 4. 
FIG. 5.    FIG. 6. 


The solution of Fejes Tóth’s [1970, 869] illumination problem has been completed 
by Bruce Henry (1973). 

Wills writes that the paper of Bokowski and Odlyzko (1973) is relevant to his 
problem [1971, 47] on lattice points and volume /area ratio of convex bodies, and 
that Bokowski, Hadwiger and Wills (1972) have solved the related problem of 
finding a lower bound for the number of lattice points, G(K), of a convex body. 
There are two conjectures concerning the problems of finding an upper bound for 
G(K) and for G(K), the number of lattice points on the boundary. The first, 


(2)    $$G(K) \le \sum_{i=0}^{n} \binom{n}{i} \frac{W_i(K)}{\omega_i},$$


where $W_i(K)$ is Minkowski’s ‘‘Quermassintegrale’’ and $\omega_i$ is the volume of the 
i-dimensional unit sphere, is mentioned in Wills (1973). The second, for G(K), is 
due to Hadwiger and Wills; the sum is taken over only odd values of i and then 
doubled. Partial results have been obtained by Wills in collaboration with Bokowski 
(1973), McMullen (1973) and Hadwiger (1973). 

Rosenfeld’s problem [1971, 49] to find the number of graphs on n vertices with 
k cliques has been solved for k = 3 by P. McMullen (wrc); the number may be written 


$$\left[ (n + 3)(6n^4 - 18n^3 + 34n^2 - 62n + 165 + 45(-1)^n)/1440 \right],$$


where brackets denote greatest integer not greater than. 
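
McMullen's expression can be evaluated exactly in integer arithmetic. The sketch below assumes that the exponents in the display are 4, 3 and 2 in descending order; the function name is chosen here for illustration only.

```python
def graphs_with_three_cliques(n):
    """Evaluate [(n+3)(6n^4 - 18n^3 + 34n^2 - 62n + 165 + 45(-1)^n)/1440]."""
    numerator = (n + 3) * (
        6 * n**4 - 18 * n**3 + 34 * n**2 - 62 * n + 165 + 45 * (-1) ** n
    )
    # Floor division gives the greatest integer not greater than the quotient.
    return numerator // 1440


for n in range(2, 8):
    print(n, graphs_with_three_cliques(n))
```

For n = 3 the value is 1 (three isolated vertices), and for n = 4 it is 4, which agrees with a direct count of the graphs on four vertices having exactly three maximal complete subgraphs.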

Doran’s (1972) solution to his problem [1971, 178]: if A is a symmetric *-algebra 
without identity, is the *-algebra obtained from A by adjoining an identity symmetric; 
has now appeared. 

Lind writes that Cahen informs him that his problem [1971, 179] on which 
polynomials map the algebraic integers into themselves, goes back to Ostrowski 
(1919) and Pólya (1919). Cahen has sent a description of his work (1972, 1973), that 
of Chabert (1971, 1972, 1973), of Brizolis (1973, tbp), of Gunji and McQuillan 
(1969, 1970) and of McClure (1971). 

Hering’s (1973) paper solving his problem [1971, 275] on inequalities has now 
appeared, and Singmaster’s (1973) paper concerning his problem [1971, 385] on 
repetitions of integers as binomial coefficients is also appearing. 

The downfall of Duke’s [1971, 386] conjecture concerning the genus and Betti 
number of a graph is mentioned by Nordhaus (1972). His papers with Stewart and 
White (1971) and with Ringeisen, Stewart and White (1972) are related. Duke 
mentions two other papers of Ringeisen (1972) and states his belief that he can use 
the work of Martin Milgram and Peter Ungar, reported in GK [1971, 1120], to 
show that the best possible relation between the Betti number and genus of a graph 
is $\beta = \gamma(2 + c/\log\gamma)$ for some constant c. 

Witsenhausen (tbp) has improved Rosenthal’s bounds, quoted by Bolker [1971, 
529] in discussing the zonoid problem. 

Payne writes concerning his article [1971, 659] on linear transformations of a 
finite field that his paper (1972) has now appeared, and that he has solved (1971) the 
problem (not the general problem) mentioned there. 

Chui (tbp) has a paper related to his problem [1971, 779] on fields, due to point 
masses, as does Newman (tbp). 

Herda’s conjecture [1971, 888] that a circle maximizes the minimum pseudo- 
diameter has been confirmed by Ault (1974), Batten (wrc), Chakerian (1974), Davies 
(wrc), Fink (wrc), Goodey (1972), Johnson (wrc), Lipskie (wrc), Short (wrc), Wente 
(wrc) and Witsenhausen (1972); see Herda’s (1974) article for details. A related 
problem was considered by Besicovitch (1961), Danzer (1963), Koenen (1971) and 
Nash-Williams (1972). 

Smith writes that the two conjectures of his paper with Kumin [1972, 157] have 
been answered affirmatively by Joan P. Hutchinson (tbp), Frank Owens and Louis 
H. Rowen (tbp). 




Papers related to Peterson’s question [1972, 505]: do self-intersections characterize 
curves of constant width, are those of Goodey (tbp) and Peterson himself (tbp). 

H. Kharaghani (wrc) has studied the Hadamard maximum determinant problem 
discussed by Brenner and Cummings [1972, 626]: the reference to Schmidt (1970) 
has wrong page numbers and a reference to Hall and Ryser (1951) might have been 
more appropriate than the one to Hall (1956). The bound given by Popoviciu (1937) 
is not, as stated, sharper than that of Barba (1933). As a start to the related problem 
that they mentioned, in which matrix entries are restricted to a sector | 6 SO 
<x of the unit circle, Brenner (wrc) and Cummings show that for n = 2 the maxi- 
mum modulus of the determinant is max{2,2 sin 20)}. 

Doran writes to point out that the problem in his paper [1972, 762], does there 
exist more than one Banach *-algebra with discontinuous involution, has some 
trivial solutions: e.g., adjoin an identity twice to Bonsall’s example and take finite 
direct sums, or take the tensor product with a finite-dimensional algebra. These two 
constructions can be carried out before or after adjoining an identity, but are of more 
interest in the latter case. 

The footballers of Croam, discussed by Biggs [1972, 1020] do not need to play 
on Sunday. The conjecture that $O_k$ is never edge-k-colorable is false. Meredith 
and Lloyd (1972, tbp) have shown that k colors suffice for the edges of $O_k$ when 
k = 5 and 6. The case k = 5 was also settled by G. Szekeres. In fact Meredith and 
Lloyd show that $O_4$ is the union of 2 hamilton circuits, $O_5$ the union of 2 hamilton 
circuits and a 1-factor and $O_6$ the union of 3 hamilton circuits, so that Biggs is now 
tempted towards the opposite conjecture, that for k > 3, $O_k$ contains $[\tfrac{1}{2}k]$ 
edge-disjoint hamilton circuits. 

There is a misprint in the paper of Erdős and Guy [1973, 52]; the ratio of the 
crossing number of the complete graph to $\binom{n}{4}$ (not $n^4$) tends to a limit between 3/10 
and 3/8. D. Singer (wrc) has shown that the rectilinear crossing number of the 
complete graph on 10 vertices satisfies $\bar{\nu}(K_{10}) \le 62$, and he and H. F. Jensen (wrc) 
have independently improved Jensen’s (1971) upper bound. From Jensen’s work 
it is possible to deduce that 


$$\bar{\nu}(K_n) \le \left[ (n-1)(n-3)^2(5n-4)/312 \right]$$


and each notes that the limit corresponding to that mentioned above is at most 
5/13. In view of Singer’s discovery, it is now more plausible to conjecture that this 
upper bound can be further improved. We reported that the number of non-isomorphic 
optimal drawings of $K_9$ is ‘about 200’. This was based on the fact that there 
are 181 drawings in which the responsibility of (total number of crossings on edges 
incident with) at least one vertex is 18, and the belief that there were few drawings 
with all responsibilities smaller. A. Uytterhoeven (wrc) of Baal, Belgium, has made 
an extensive search for such drawings. He confirms the number 181, but has found 
no fewer than 230 drawings with responsibilities less than 18, 4 of them with all 
responsibilities 16. He does not claim completeness, but says that the list of 411 
‘est à peu près complète’. 
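
The bound deduced from Jensen's work can be compared numerically with Singer's value for 10 vertices and with the limiting ratio 5/13. The sketch below assumes that the bound reads [(n-1)(n-3)^2(5n-4)/312]; the function name is illustrative only.

```python
from math import comb


def rectilinear_upper_bound(n):
    """Assumed form of the bound: floor((n-1)(n-3)^2(5n-4)/312)."""
    return (n - 1) * (n - 3) ** 2 * (5 * n - 4) // 312


# For n = 10 the bound exceeds Singer's 62, so it is not sharp there.
print(rectilinear_upper_bound(10))

# The ratio of the bound to the binomial coefficient C(n, 4) approaches 5/13.
for n in (10, 100, 10000):
    print(n, rectilinear_upper_bound(n) / comb(n, 4), 5 / 13)
```

The leading coefficient 5/312 of the quartic, divided by the leading coefficient 1/24 of the binomial coefficient, gives 5/13, the limit quoted above.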

By way of repeating the plea for help with keeping the section up-to-date by 
readers’ comments, references, preprints and offprints, I conclude by saying that 
most of this article is owed to a large number of helpful correspondents; not only 
those mentioned by name or for whom space does not permit a mention, but also 
editors of journals and referees, for whom confidentiality demands that their unre- 
warding work goes unacknowledged but not unappreciated. 


References 


L. E. Adelson, R. Alter and T. B. Curtz (tbp), Long snakes and a characterization of maximal 
snakes on the d-cube, Congressus Numerantium VIII, Proc. 4th S. E. Conf. on Combinatorics, Graph 
Theory and Computing, Boca Raton, 1973. 

———, Computation of d-dimensional snakes, Congressus Numerantium VIII, Proc. 4th S. E. Conf. 
on Combinatorics, Graph Theory and Computing, Boca Raton, 1973. 

R. Ault, Metric characterization of circles, this MONTHLY, 81 (1974) to appear. 

G. Barba, Intorno al teorema di Hadamard sui determinanti a valore massimo, Giorn. Mat. 
Battaglini, 71 (1933) 70-86. 

A.S. Besicovitch, A problem on a circle, J. London Math. Soc., 36 (1961) 241-244. 

J. Bokowski, H. Hadwiger and J. M. Wills, Eine Ungleichung zwischen Volumen, Oberfläche 
und Gitterpunktanzahl konvexer Körper im n-dimensionalen euklidischen Raum, Math. Z., 127 
(1972) 363-364. 
——— and A. M. Odlyzko, Lattice points and the volume/area ratio of convex bodies, Geom. 
Dedicata, 2 (1973). 
——— and J. M. Wills, Upper bounds for the number of lattice points of convex bodies, this 
MONTHLY, 81 (1974) to appear. 

D. Brizolis, Ideals of rings of integer valued polynomials, Ph.D. dissertation, U. C. L. A., 1973. 

——— (tbp), On the ratios of integer-valued polynomials over any algebraic field. 

P. J. Cahen, Polynômes à valeurs entières, Canad. J. Math., 24 (1972) 747-754. 

———, Polynômes à valeurs entières, Thèse, Paris, 1973. 
——— and J. L. Chabert, Coefficients et valeurs d’un polynôme, Bull. Sci. Math., 95 (1971) 
295-304. 

J. L. Chabert, Anneaux de polynômes à valeurs entières et anneaux de Fatou, Bull. Soc. Math. 
France, 99 (1972) 273-283. 
———, Anneaux de polynômes à valeurs entières, Colloq. d’alg. Rennes, No. 8 (1972). 

———, Anneaux de polynômes à valeurs entières et extensions de Fatou, Thèse, 1973. 

G. D. Chakerian, A characterization of curves of constant width, this MONTHLY, 81 (1974) to 
appear. 

G. Chartrand, A. M. Hobbs, H. A. Jung, S. F. Kapoor and C. St. J. A. Nash-Williams (tbp), 
The square of a block is hamiltonian connected, J. Combinatorial Theory. 

C.K. Chui (tbp), On approximation in the Bers spaces, Proc. Amer. Math. Soc., 

L. Danzer, A characterization of the circle, Proc. Sympos. Pure Math., VII, Convexity, Amer. 
Math. Soc., 1963, 99-100. 

H. Donnelly (tbp), On a problem concerning Euler’s phi-function, this MONTHLY, 80 (1973) 
1029-1031. 

R.S. Doran, A generalization of a theorem of Civin and Yood on Banach*-algebras, Bull. 
London Math. Soc., 4 (1972) 25-26. 




H. Fleischner (tbp), On spanning subgraphs of a connected bridgeless graph and their application 
to DT-graphs, J. Combinatorial Theory, 16B (1974). 
———, The square of every two-connected graph is Hamiltonian, ibid. 

H. Fleischner and H. B. Kronk, Hamiltonsche Linien im Quadrat brückenloser Graphen mit 
Artikulationen, Monatsh. Math., 76 (1972) 112-117. 

S. W. Golomb, How to number a graph, in R. C. Read (ed.), Graph Theory and Computing, 
Academic Press, 1972, 23-37. 

P. R. Goodey, A characterization of circles, Bull. London Math. Soc., 4 (1972) 199-201. 
——— (tbp), Intersections of circles and curves of constant width. 

E. Grosswald, Contribution to the theory of Euler’s function φ(x), Bull. Amer. Math. Soc., 
79 (1973) 337-341. 

H. Gunji and D. L. McQuillan, On polynomials with integer coefficients, J. Number Theory, 
1 (1969) 486-493. 

———, On a class of ideals in an algebraic number field, J. Number Theory, 2 (1970) 
207-221. 

H. Hadwiger and J. M. Wills, Über Eikörper und Gitterpunkte im gewöhnlichen Raum, Geom. 
Dedicata, 2 (1973). 

M. Hall, A survey of difference sets, Proc. Amer. Math. Soc., 7 (1956) 975-986. 

M. Hall and H. J. Ryser, Cyclic incidence matrices, Canad. J. Math., 3 (1951) 495-502. 

B. R. Henry, Solution of Fejes Tóth’s illumination problem, this MONTHLY, 80 (1973) 409-410. 

H. Herda, A characterization of circles and other closed curves, this MONTHLY, 81 (1974) to 
appear. 

F. Hering, Eine Verallgemeinerung der Ungleichung vom arithmetischen und geometrischen 
Mittel, Monatsh. Math., 77 (1973) 31-42. 

A. M. Hobbs, The square of a block is vertex pancyclic, Graph Theory Newsletter, W. Mich. 
Univ., 2 #5 (1973) 2. 

J. P. Hutchinson (tbp), Eulerian graphs and polynomial identities for sets of matrices; see also 
Matrices satisfying [A41; Ao, ..., An] = 0, AMS Notices, 19 (1972) A729, 

H. F. Jensen, An upper bound for the rectilinear crossing number of the complete graph, J. 
Combinatorial Theory, 10B (1971) 212-216. 

W. Koenen, Characterizing the circle, this MONTHLY, 78 (1971) 993-996. 

A. Kotzig, On vertex-valuations and magic valuations of certain bichromatic graphs, Publications 
C. R. M. #233, Montreal, Oct. 1972. 
———, On certain vertex valuations of finite graphs, Publications C. R. M. #236, Montreal, 
Oct. 1972. 
——— and A. Rosa, Magic valuations of finite graphs, Canad. Math. Bull., 13 (1970) 451-461. 

D. G. Larman, The convex borel sets in $R^3$ are convexly generated, J. London Math. Soc., 
(2), 4 (1971) 5-14. 

C. R. McClure, Common divisors of values of polynomials, J. Number Theory, 3 (1971) 33-34. 

P. McMullen and J. M. Wills, Zur Gitterpunktanzahl auf dem Rand konvexer Körper, Monatsh. 
Math., 77 (1973). 

G. H. J. Meredith and E. K. Lloyd, The hamiltonian graphs $O_4$ to $O_7$, in D. J. A. Welsh (ed.), 
Combinatorics, I. M. A. 1972, 229-236. 
——— (tbp), The footballers of Croam, J. Combinatorial Theory. 

C. St. J. A. Nash-Williams, Plane curves with many inscribed rectangles, J. London Math. 
Soc., 5 (1972) 417-418. 

D.J. Newman (tbp), A lower bound for an area integral. 

E. A. Nordhaus, On the girth and genus of a graph, in Graph Theory and Applications (Proc. 
Conf. W. Mich. U. ), Springer, 1972, 207-214. 


———, B. M. Stewart and A. T. White, On the maximum genus of a graph, J. Combinatorial 
Theory, 11B (1971) 258-267. 
———, R. D. Ringeisen, B. M. Stewart and A. T. White, A Kuratowski-type theorem for the 
maximum genus of a graph, J. Combinatorial Theory, 12B (1972) 260-267. 

A. M. Ostrowski, Über ganzwertige Polynome in algebraischen Zahlkörpern, J. Reine Angew. 
Math., 149 (1919) 117-124. 

S. E. Payne, A complete determination of translation ovoids in finite desarguian planes, Atti 
Accad. Naz. Lincei, Rend. Cl. Sci. Fis. Mat. Natur., 51 (1971) 328-331. 
———, Generalized quadrangles as amalgamations of projective planes, J. Algebra, 22 (1972) 
120-136. 

B. B. Peterson (tbp), Intersection properties of curves of constant width, Illinois J. Math. 

G. Pólya, Über ganzwertige Polynome in algebraischen Zahlkörpern, J. Reine Angew. Math., 
149 (1919) 97-116. 

J. Popoviciu, Remarques sur le maximum d’un déterminant dont tous les éléments sont non 
négatifs, Bul. Soc. Sti. Cluj, 8 (1937) 572-582. 

R. D. Ringeisen, Determining all compact orientable 2-manifolds upon which $K_{m,n}$ has 2-cell 
embeddings, J. Comb. Theory, 12B (1972) 101-104. 

———, Upper and lower embeddable graphs, in Graph Theory and Applications, Proc. Conf. 
W. Mich. Univ., 1972, 261-268. 

G. Ringel, Problem 25, Theory of Graphs and its Applications, Proc. Sympos. Smolenice 1963, 
Prague, 1964, 162. 

A. Rosa, On certain valuations of the vertices of a graph, Theory of Graphs, Proc. Internat. 
Sympos. Rome, 1966, Gordon & Breach, N. Y., 1967, 349-355. 

L. H. Rowen, On classical quotients of polynomial identity rings with involution, Proc. Amer. 
Math. Soc., 40 (1973) 23-29; see also Standard identities for matrix rings with involution, AMS 
Notices, 20 (1973) A76. 

K. W. Schmidt, Lower bounds for maximal (0,1) determinants, SIAM J. Appl. Math., 19 (1970) 
440-442. 

J. Sedláček, Problem 27, Theory of Graphs and its Applications, Proc. Sympos. Smolenice 
1963, Prague, 1964, 163-164. 

D. Singmaster, Repeated binomial coefficients and Fibonacci numbers, Fibonacci Quart., 11 
(1973). 

R. Stanley (tbp), Linear homogeneous diophantine equations and magic labellings of graphs, 
Duke Math. J., 40 (1973). 

R. G. Stanton and C. R. Zarnke (tbp), Labelling of balanced trees, Congressus Numerantium VIII, 
Proc. 4th S. E. Conf. on Combinatorics, Graph Theory and Computing, Boca Raton, 1973. 

B. M. Stewart, Magic graphs, Canad. J. Math., 18 (1966) 1031-1059. 

M. V. Subbarao, T. J. Cook, R.S. Newberry and J. M. Weber, On unitary perfect numbers, 
Delta, 3 (1972) 22-26. 

Z. Usiskin and S. G. Wayment, Partitioning a triangle into 5 triangles similar to it, Math. Mag., 
45 (1972) 37-42. 

J.M. Wills, Zur Gitterpunktanzahl konvexer Mengen, Elem. Math., 28 (1973). 

H. S. Witsenhausen, On closed curves in Minkowski spaces, Proc. Amer. Math. Soc., 35 (1972) 
240-241. 

——— (tbp), Metric inequalities and the zonoid problem, Proc. Amer. Math. Soc. 

A. D. Wyner, Note on circuits of spread k in the n-cube, I. E. E. E. Trans. Computers, C20 
(1971) 474. 

J. Zaks, Graph Theory Newsletter, W. Mich. Univ., 1 #4 (1972) 7. 


CLASSROOM NOTES 
EDITED BY ROBERT GILMER 


Material for this Department should be sent to David Roselle, Department of Mathematics, 
Louisiana State University, Baton Rouge, LA 70808. 


THE MINIMAL POLYNOMIAL OF A LINEAR TRANSFORMATION 
M. D. Burrow, Courant Institute of Mathematical Sciences 


1. Introduction. It seems that none of the textbooks on linear algebra gives a 
direct proof of the fact that the minimal polynomial m(x) of a linear transformation 
T is of degree less than or equal to the dimension of the vector space V on which 
T acts. The usual proof depends on the fact that m(x) divides the characteristic 
polynomial f(x), the degree of which is equal to the dimension of V, and for this 
one needs the Cayley-Hamilton theorem. In Theorem 1 we give a direct proof using 
mathematical induction on the dimension of V. The induction is brought into play 
by the use of quotient spaces and linear transformations induced on them by in- 
variant subspaces. Theorem 2 shows that if m(x) is a power of an irreducible poly- 
nomial p(x) of degree r, then r divides n, where n is the dimension of the vector 
space V. This leads to an expression for the characteristic polynomial f(x) in terms 
of the irreducible factors of m(x) in the general case. 


2. THEOREM 1. Let V be a vector space of dimension n over a field F. Let 
$T: V \to V$ be a linear transformation. Then the minimal polynomial m(x), that 
is the monic polynomial of minimal degree for which m(T)V = 0, is of degree 
less than or equal to n. 


Proof. Suppose that dim V = 1. Then for any non-zero vector $\alpha$ in V we have 
$V = F\alpha$. It follows that $T\alpha = k\alpha$, where k is some element of F. Hence $(T - kI)\alpha = 0$ 
where I is the identity map on V. This shows that m(x) = x − k is the minimal 
polynomial of T. Since deg m(x) = 1 we see that the theorem is true for the case 
n = 1. 

Now, to make an induction on the dimension we assume that the theorem is 
true for all spaces W of dimension less than n. Let dim V = n, and suppose that $\alpha$ 
is a non-zero vector in V. Then the n + 1 vectors $\alpha, T\alpha, T^2\alpha, \ldots, T^n\alpha$ are linearly 
dependent so that there is a set $\{a_0, a_1, \ldots, a_n\}$ of elements of F, not all of them 
zero, such that 


$$a_0\alpha + a_1 T\alpha + \cdots + a_n T^n\alpha = 0.$$


Writing $g(x) = a_0 + a_1 x + \cdots + a_n x^n$, we see that $\deg g(x) \le n$ and $g(T)\alpha = 0$. 
If $g(T)V = 0$, then, because m(x) is the minimal polynomial, we have 
$\deg m(x) \le \deg g(x) \le n$ and so the theorem holds. Suppose now that $g(T)V \ne 0$. Let 
$U = \{\beta \in V : g(T)\beta = 0\}$. Then U is a proper subspace of V and $\dim U = r < n$; 
moreover $r \ge 1$, since $\alpha$ lies in U. By 
the inductional assumption there is a polynomial $m_1(x)$ of degree less than or equal 
to r such that $m_1(T)U = 0$. Moreover, $TU \subseteq U$, since g(T) commutes with T, 
so that U is an invariant subspace of V. Now consider the quotient space V/U. 
We have dim V/U = n − r < n and so, by the inductional assumption again, there 
is a polynomial $m_2(x)$ of degree $\le n - r$ such that $m_2(T)V/U = 0$. This means 
that $m_2(T)V \subseteq U$, so that $m_1(T)m_2(T)V = 0$. 

Writing $h(x) = m_1(x)m_2(x)$ we have $h(T)V = 0$ so that if m(x) is the minimal 
polynomial 


$$\deg m(x) \le \deg h(x) = \deg m_1(x) + \deg m_2(x) \le r + (n - r) = n.$$
Thus the theorem holds in all cases and the proof is complete. 
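
Theorem 1 can be checked on small examples by computing the minimal polynomial directly. The Python sketch below does not follow the inductive proof; it simply searches for the first power $T^d$ that is a linear combination of $I, T, \ldots, T^{d-1}$, using exact rational arithmetic. The function names and the list-of-rows matrix representation are choices made here for illustration.

```python
from fractions import Fraction


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve(columns, target):
    """Solve sum_j x_j * columns[j] = target exactly; return None if impossible."""
    rows, d = len(target), len(columns)
    aug = [[columns[j][i] for j in range(d)] + [target[i]] for i in range(rows)]
    pivots, r = [], 0
    for c in range(d):
        p = next((i for i in range(r, rows) if aug[i][c] != 0), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        inv = 1 / aug[r][c]
        aug[r] = [v * inv for v in aug[r]]
        for i in range(rows):
            if i != r and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [vi - factor * vr for vi, vr in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    # A row 0 = nonzero means the system is inconsistent.
    if any(all(v == 0 for v in row[:d]) and row[d] != 0 for row in aug):
        return None
    x = [Fraction(0)] * d
    for i, c in enumerate(pivots):
        x[c] = aug[i][d]
    return x


def minimal_polynomial(T):
    """Return [c_0, ..., c_{d-1}, 1], the coefficients of the monic m(x)."""
    n = len(T)
    T = [[Fraction(x) for x in row] for row in T]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    lower = []  # flattened I, T, ..., T^(d-1)
    while True:
        flat = [x for row in power for x in row]
        coeffs = solve(lower, flat) if lower else None
        if coeffs is not None:
            # T^d = sum c_j T^j, so m(x) = x^d - sum c_j x^j.
            return [-c for c in coeffs] + [Fraction(1)]
        lower.append(flat)
        power = mat_mul(power, T)


m = minimal_polynomial([[1, 1, 0], [0, 1, 0], [0, 0, 1]])
print(m, "degree", len(m) - 1)
```

For the 3 × 3 matrix shown, the search stops at d = 2 with $m(x) = x^2 - 2x + 1$, of degree less than n = 3, as the theorem requires.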


NOTE: If $m_1(x)$ and $m_2(x)$ are the minimal polynomials of T restricted to U 
and V/U respectively, then $h(x) = m_1(x)m_2(x)$ coincides with the minimal polynomial 
of T. 


THEOREM 2. Let V be a vector space of dimension n over a field F, and let T 
be a linear transformation on V. If the minimal polynomial $m(x) = (p(x))^s$, 
where p(x) is irreducible and of degree r, then r divides n. 


Proof. Let $B = \{\alpha_1, \ldots, \alpha_n\}$ be a basis of V. We assert that there is one of 
these vectors, which, with no loss of generality, may be taken to be $\alpha_1$, such that 
the set $G = \{\alpha_1, T\alpha_1, \ldots, T^{rs-1}\alpha_1\}$ is linearly independent. 

Suppose the statement is false; then for every $\alpha_i$ in B, $(p(T))^{j_i}\alpha_i = 0$ for some 
$j_i < s$. This is so because the annihilating polynomial of minimal degree (given 
here by the assumed dependence) of a vector divides any other annihilating polynomial, 
and in particular, then, divides $(p(x))^s$. Thus $(p(T))^{s-1}V = 0$, a contradiction. 

Let U be the subspace of V generated by the set G. Then dim U equals rs. If 
U = V, then rs = n so that r divides n and we are finished. To complete the proof 
we use an induction on n. Suppose that $U \ne V$. First note that $TU \subseteq U$, since 
for every $j < rs - 1$, $T(T^j\alpha_1) = T^{j+1}\alpha_1$ is a basis element of U, whereas $(p(T))^s\alpha_1 = 0$ 
gives $T^{rs}\alpha_1$ in terms of the basis elements. We go now to the quotient space V/U. 
Since $TU \subseteq U$ we can consider the transformation $T_1$ induced on V/U by T. Since 
$(p(T_1))^s$ also annihilates V/U, the minimal polynomial of $T_1$ is $(p(x))^t$ for some 
$t \le s$. Now dim V/U = n − rs < n so that the inductional hypothesis makes r 
divide n − rs and this implies that r divides n, completing the proof. 


COROLLARY. Let V be a vector space of dimension n over a field F and let T be 
a linear transformation on V. If the minimal polynomial $m(x) = (p(x))^s$, where 
p(x) is irreducible and of degree r, then determinant $(xI - T) = (p(x))^{n/r}$. 


Proof. Assume that the statement is true for all spaces of dimension less than n. 
The case n = 1 is trivial. Let $T_1$ and $T_2$ be the linear transformations induced on 
U and V/U respectively by T. Since dim U < n and dim V/U < n, the inductional 
hypothesis gives 


$$\det(xI - T_1) = (p(x))^{rs/r} \quad\text{and}\quad \det(xI - T_2) = (p(x))^{(n-rs)/r}.$$
But then 
$$\det(xI - T) = \det(xI - T_1)\det(xI - T_2) = (p(x))^{n/r}.$$


REMARK. det(xI — T) = f(x) is, of course, the characteristic polynomial of T. 
The corollary extends at once to the following: 


THEOREM 3. Let the minimal polynomial of T be 
$$m(x) = (p_1(x))^{s_1}(p_2(x))^{s_2} \cdots (p_k(x))^{s_k},$$


where each $p_i(x)$ is irreducible of degree $r_i$. Let $V_i$ be the null space of $(p_i(T))^{s_i}$ 
and let $n_i$ be the dimension of $V_i$; then for i = 1, 2, ..., k we have that $r_i$ divides $n_i$ 
and 


$$f(x) = (p_1(x))^{n_1/r_1} \cdots (p_k(x))^{n_k/r_k}.$$


This is immediate from Theorem 2 and its corollary, since V is the direct sum 
of the $V_i$ and each $(p_i(x))^{s_i}$ is the minimal polynomial of the transformation $T_i$ 
induced on $V_i$ by T. By Theorem 1 we have $r_i s_i \le n_i$ so that $s_i \le n_i/r_i$ and hence 
m(T)V = 0 implies that f(T)V = 0. Thus the Cayley-Hamilton theorem follows 
as a consequence. 

Note that the vector $\alpha_1$ of Theorem 2 is annihilated by no polynomial of degree 
< rs, and in fact that its order (i.e., annihilating polynomial of minimal degree) 
is the minimal polynomial m(x). For the general case, in each $V_i$ there is a vector 
$\alpha_i$ whose order is the minimal polynomial $(p_i(x))^{s_i}$ and the vectors $\alpha^{(j)}$ given by 
$\alpha^{(j)} = \alpha_1 + \cdots + \alpha_j$, for j = 1, 2, ..., k, have orders $\prod_{i=1}^{j}(p_i(x))^{s_i}$. Thus $\alpha = \alpha^{(k)}$ 
has order m(x), the minimal polynomial of T. 
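
A small worked example may help to fix the notation of Theorem 3. The matrix below is chosen here purely for illustration and does not come from the note itself. Take F to be the real field and let T act on $V = R^5$ by

```latex
T = \begin{pmatrix} C & 0 & 0 \\ 0 & C & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
C = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
m(x) = (x^2 + 1)(x - 1).
% Here p_1(x) = x^2 + 1, r_1 = 2, s_1 = 1 and p_2(x) = x - 1, r_2 = 1, s_2 = 1.
% V_1 = null space of T^2 + I is spanned by e_1, ..., e_4, so n_1 = 4;
% V_2 = null space of T - I is spanned by e_5, so n_2 = 1.
% r_1 divides n_1 and r_2 divides n_2, and
f(x) = (x^2 + 1)^{n_1/r_1} (x - 1)^{n_2/r_2} = (x^2 + 1)^2 (x - 1).
```

This agrees with the determinant of xI − T computed block by block, and $s_i \le n_i/r_i$ holds in both cases, so that m(x) divides f(x).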


ANOTHER PROOF OF THE RATIONAL DECOMPOSITION THEOREM 
H. G. JACOB, University of Massachusetts 


1. Introduction. The Rational Decomposition Theorem states that a finite 
dimensional vector space under a linear transformation decomposes into a direct 
sum of cyclic subspaces. There are at least two rather well-known proofs of this 
theorem. The more elegant one applies, to the case of a single linear transformation, 
the theorem that a finitely generated module over a principal ideal domain is the 
direct sum of cyclic submodules [3, p. 386]. The more direct proof involves showing 
that a cyclic subspace of maximum dimension is a direct summand in a decomposition 
into invariant subspaces [4, p. 309]. The purpose of this note is to give a somewhat 
different argument for the latter result. It is based on some simple facts about the 
dual space and the existence of a vector whose minimal polynomial equals that of 
the linear transformation. The technique used extends that employed by C. W. 
Curtis in showing that a nilpotent linear transformation yields a decomposition into 
cyclic subspaces [1, p. 192]. 


2. Preliminaries. (i) Let T be a linear transformation on a finite dimensional 
vector space V over a field F. The minimal polynomial $m(x) \in F[x]$ of T needs no 
explanation [4, p. 306]. The minimal polynomial $m_y(x)$ of a vector $y \in V$ relative 
to T is the monic polynomial of least degree such that $m_y(T)y = 0$. It follows 
that $m_y(x)$ divides m(x). The existence of $y \in V$ such that $m_y(x) = m(x)$ can be 
argued in two steps. First, it is immediate when $m(x) = f(x)^e$ where f(x) is an irreducible 
polynomial and e a nonnegative integer. Second, if 
$m(x) = f_1(x)^{e_1} f_2(x)^{e_2} \cdots f_r(x)^{e_r}$ 
where $f_i(x)$ is irreducible then $V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$ where $V_i$, for $1 \le i \le r$, 
is T-invariant and the minimal polynomial of T restricted to $V_i$ is $f_i(x)^{e_i}$. The vector 
$y = y_1 + y_2 + \cdots + y_r$ where $y_i \in V_i$ and $m_{y_i}(x) = f_i(x)^{e_i}$ is the desired element. 
The theorem that V decomposes into the direct sum of T-invariant spaces $V_i$ is sometimes 
called the Primary Decomposition Theorem [2, p. 220]. 

(ii) The space V* of all linear forms on V, linear transformations from V to 
F, is called the dual space of V. For $y^* \in V^*$ and $y \in V$ we use the symbol $\langle y^*, y\rangle$ 
to denote $y^*(y) \in F$. Thus 


$\langle y_1^* + y_2^*, y\rangle = \langle y_1^*, y\rangle + \langle y_2^*, y\rangle$ for $y_1^*$ and $y_2^*$ in $V^*$ and y in V. 
$\langle y^*, y_1 + y_2\rangle = \langle y^*, y_1\rangle + \langle y^*, y_2\rangle$ for $y^* \in V^*$ and $y_1$ and $y_2$ in V. 
$\langle ay^*, y\rangle = a\langle y^*, y\rangle = \langle y^*, ay\rangle$ for $y^* \in V^*$ and $y \in V$ and $a \in F$. 


By associating y to the linear form $\langle y^*, y\rangle$ on V* we can identify V naturally with 
$V^{**} = (V^*)^*$. Moreover, for a basis $y_1, y_2, \ldots, y_n$ of V there exists a basis 
$y_1^*, y_2^*, \ldots, y_n^*$ of $V^*$, called the dual basis of $y_1, y_2, \ldots, y_n$, such that 


$$\langle y_i^*, y_j\rangle = \delta_{ij}, \text{ the Kronecker delta.}$$


Since $\langle y^*, Ty\rangle$, for $y^*$ fixed, is a linear form there exists an element denoted by 
$T^*y^*$ in $V^*$ such that 
$$\langle T^*y^*, y\rangle = \langle y^*, Ty\rangle.$$


It is readily seen that the mapping T* which maps y* to T*y* is a linear transformation 
on V*. If $f(x) \in F[x]$ then 


$$\langle f(T^*)y^*, y\rangle = \langle y^*, f(T)y\rangle.$$


From this it follows that T and T* have the same minimal polynomial. 
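(iii) Let W be a subspace of V with dim W = r and define 


$$W^\perp = \{y^* \in V^* \mid \langle y^*, y\rangle = 0 \text{ for all } y \in W\}.$$


Then $W^\perp$ is a subspace of $V^*$ with $\dim W^\perp = \dim V - r$ and $W^{\perp\perp} = (W^\perp)^\perp = W$. 
Furthermore, if W is T-invariant then $W^\perp$ is $T^*$-invariant. Consequently 


$$V = W_1 \oplus W_2 \text{ with } W_i\ T\text{-invariant} \implies V^* = W_1^\perp \oplus W_2^\perp \text{ with } W_i^\perp\ T^*\text{-invariant.}$$
Of course the dual also holds, i.e., 
$$V^* = W_1^* \oplus W_2^* \text{ with } W_i^*\ T^*\text{-invariant} \implies V = W_1^{*\perp} \oplus W_2^{*\perp} \text{ with } W_i^{*\perp}\ T\text{-invariant.}$$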


(iv) A subspace W of V is said to be T-cyclic if there exists a vector $y \in W$ 
and a nonnegative integer r such that $y, Ty, \ldots, T^r y$ form a basis for W. Thus for 
the vector y if the degree of $m_y(x)$ is k then $y, Ty, \ldots, T^{k-1}y$ are linearly independent 
and the space W spanned by these k vectors is T-cyclic. 


3. Main Theorem. We are now in a position to prove our principal result. 


THEOREM: Let V be an n-dimensional vector space (n finite) and T a linear 
transformation on V. Then V is the direct sum of T-cyclic subspaces. 


Proof. Let k be the degree of the minimal polynomial m(x) of T, and let y be 
a vector in V with $m_y(x) = m(x)$. Then the space W spanned by $y, Ty, \ldots, T^{k-1}y$ 
is T-cyclic. We shall prove that if $W \ne V$ ($k \ne n$) then there exists a T-invariant 
subspace $W'$ such that $V = W \oplus W'$. Clearly, by induction on the dimension, $W'$ 
will then be the direct sum of T-cyclic subspaces and the proof complete. 

To show the existence of $W'$ enlarge the basis $y_1 = y, y_2 = Ty, \ldots, y_k = T^{k-1}y$ 
of W to a basis $y_1, y_2, \ldots, y_k, \ldots, y_n$ of V and let $y_1^*, y_2^*, \ldots, y_k^*, \ldots, y_n^*$ be the dual basis. 
To simplify notation let $y^* = y_k^*$. Then 


$$\langle y^*, y_i\rangle = 0 \text{ for } 1 \le i \le k-1 \text{ and } \langle y^*, y_k\rangle = 1.$$


Consider the space $W^*$ spanned by $y^*, T^*y^*, \ldots, T^{*k-1}y^*$. Since m(x) is also the 
minimal polynomial of T* the space $W^*$ is T*-invariant. Now observe that if 
$W^* \cap W^\perp = \{0\}$ and $\dim W^* = k$ then $V^* = W^* \oplus W^\perp$ where $W^*$ and $W^\perp$ are 
T*-invariant (since $\dim W^\perp = n - k$). This in turn implies (from iii) the desired 
decomposition $V = W^{\perp\perp} \oplus W^{*\perp} = W \oplus W'$ where $W^{\perp\perp} = W$ and $W^{*\perp} = W'$ are 
T-invariant. 

Finally we shall prove that $W^* \cap W^\perp = \{0\}$ and $\dim W^* = k$ simultaneously 
as follows. Suppose that $a_0 y^* + a_1 T^*y^* + \cdots + a_s T^{*s}y^* \in W^\perp$ where $a_s \ne 0$ and 
$0 \le s \le k - 1$. Then 


$$T^{*k-1-s}(a_0 y^* + a_1 T^*y^* + \cdots + a_s T^{*s}y^*) = a_0 T^{*k-1-s}y^* + a_1 T^{*k-s}y^* + \cdots + a_s T^{*k-1}y^*$$


is in $W^\perp$ since $W^\perp$ is T*-invariant. Therefore 
$$\langle (a_0 T^{*k-1-s} + a_1 T^{*k-s} + \cdots + a_s T^{*k-1})y^*, y\rangle = 0.$$
This implies (from ii) 
$$\langle y^*, (a_0 T^{k-1-s} + a_1 T^{k-s} + \cdots + a_s T^{k-1})y\rangle = \langle y^*, a_0 y_{k-s} + a_1 y_{k-s+1} + \cdots + a_s y_k\rangle = a_s = 0$$
which is a contradiction. 
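
The construction of $W'$ can be followed on a three-dimensional example. The transformation and the choice of the extending basis vector below are made here for illustration only; they are not taken from the note.

```latex
% V = F^3 with standard basis e_1, e_2, e_3 and dual basis e_1^*, e_2^*, e_3^*.
T e_1 = e_1, \quad T e_2 = e_1 + e_2, \quad T e_3 = e_3, \qquad m(x) = (x - 1)^2, \quad k = 2.
% y = e_2 has m_y(x) = m(x), since (T - I)e_2 = e_1 \ne 0 and (T - I)^2 e_2 = 0.
y_1 = y = e_2, \quad y_2 = Ty = e_1 + e_2, \quad W = \mathrm{span}(e_1, e_2).
% Enlarge to a basis of V by y_3 = e_1 + e_3 (one of many possible choices).
% The dual basis element y^* = y_2^* vanishes on y_1 and y_3 and equals 1 on y_2:
y^* = e_1^* - e_3^*, \qquad T^* y^* = e_1^* + e_2^* - e_3^*.
% W^* = span(y^*, T^* y^*) = span(e_1^* - e_3^*, e_2^*) has dimension 2 = k,
% and W^\perp = span(e_3^*) meets W^* only in 0.
W' = W^{*\perp} = \{ a e_1 + b e_2 + c e_3 : a - c = 0,\ a + b - c = 0 \} = \mathrm{span}(e_1 + e_3).
% T(e_1 + e_3) = e_1 + e_3, so W' is T-invariant and V = W \oplus W'.
```

Note that the complement obtained depends on the vector used to enlarge the basis: choosing $y_3 = e_3$ instead would give $W' = \mathrm{span}(e_3)$.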


References 


1. C. W. Curtis, Linear Algebra, Allyn and Bacon, Boston, 1968. 
2. K. Hoffman and R. Kunze, Linear Algebra, Prentice Hall, Englewood Cliffs, N. J., 1971. 
3. S. Lang, Algebra, Addison Wesley, Reading, Mass., 1965. 
4. L. J. Page and J. D. Swift, Elements of Linear Algebra, Blaisdell, Waltham, Mass., 1961. 


A PROOF OF THE CHAIN RULE FOR DERIVATIVES IN n-SPACE 
A. G. FADELL, State University of New York at Buffalo 


The usual proofs of the chain rule for derivatives in n-space utilize an $\varepsilon$-$\delta$
argument by approximations. In the view of the author the following proof is of
the kind the average student finds easiest to follow. We use the notation of R. G.
Bartle's The Elements of Real Analysis, Wiley, 1964, Sec. 20.

We are given a function $f$ with domain $D(f)$ in $\mathbf{R}^p$ and values in $\mathbf{R}^q$, and $g$ a
function with domain $D(g)$ in $\mathbf{R}^n$ and range in $\mathbf{R}^p$, with $g(c) = a$, where $a$ is an
interior point of $D(f)$ and $c$ an interior point of $D(g)$.

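Assume that $f$ has a derivative $L_f$ at $u = a$ and that $g$ has a derivative $L_g$ at
$x = c$. We show that the composition $f \circ g$ has a derivative $L_f \circ L_g$ at $x = c$.
Let

$$(1)\qquad \varphi_f(u) = \begin{cases} \dfrac{f(u) - f(a) - L_f(u-a)}{|u-a|}, & u \neq a, \\[1ex] 0, & u = a. \end{cases}$$

Then $\varphi_f$ is continuous at $u = a$, and in view of (1) for any $u \in D(f)$

$$(2)\qquad f(u) - f(a) = \varphi_f(u)\,|u-a| + L_f(u-a).$$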


Letting $u = g(x)$, $a = g(c)$ in (2) and subtracting $L_f[L_g(x-c)]$ from both sides
we have

$$f[g(x)] - f[g(c)] - L_f[L_g(x-c)] = \varphi_f[g(x)]\,|g(x) - g(c)| + L_f[g(x) - g(c)] - L_f[L_g(x-c)].$$

Dividing both sides by $|x - c|$ and using the linearity of $L_f$ we obtain

$$\frac{f[g(x)] - f[g(c)] - L_f[L_g(x-c)]}{|x-c|} = \varphi_f[g(x)]\,\frac{|g(x) - g(c)|}{|x-c|} + L_f\!\left[\frac{g(x) - g(c) - L_g(x-c)}{|x-c|}\right].$$

Finally, the right side has limit 0 as $x \to c$ since $\varphi_f[g(x)] \to \varphi_f(g(c)) = \varphi_f(a) = 0$,
$|g(x) - g(c)|/|x - c|$ is locally bounded, and the last term has limit $L_f(0) = 0$,
since $g$ is differentiable at $c$. Thus, by definition $L_f \circ L_g$ is the derivative of $f \circ g$.
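The limit statement at the end of the proof can be illustrated numerically. The following is only a sketch and not part of the note: the maps `f` and `g`, the base point `c`, and every helper name are hypothetical choices made for illustration. It evaluates the quotient on the left side of the last display for shrinking increments and shows it tending to 0.

```python
import math

# Hypothetical smooth maps g: R^2 -> R^2 and f: R^2 -> R^2 (illustration only).
def g(x, y):
    return (x * y, x + math.sin(y))

def Lg(x, y):
    # Jacobian matrix of g at (x, y)
    return ((y, x), (1.0, math.cos(y)))

def f(u, v):
    return (u * u + v, math.exp(u) * v)

def Lf(u, v):
    # Jacobian matrix of f at (u, v)
    return ((2 * u, 1.0), (math.exp(u) * v, math.exp(u)))

def apply(matrix, vec):
    # Matrix-vector product for small tuples
    return tuple(sum(m * t for m, t in zip(row, vec)) for row in matrix)

def norm(vec):
    return math.sqrt(sum(t * t for t in vec))

def remainder_ratio(c, h):
    """|f(g(c+h)) - f(g(c)) - Lf(Lg(h))| / |h|, the quotient in the proof."""
    a = g(*c)
    x = (c[0] + h[0], c[1] + h[1])
    linear = apply(Lf(*a), apply(Lg(*c), h))
    fx, fa = f(*g(*x)), f(*a)
    diff = tuple(p - q - l for p, q, l in zip(fx, fa, linear))
    return norm(diff) / norm(h)

c = (0.5, 1.0)
ratios = [remainder_ratio(c, (t, -t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)  # the quotients shrink as the increment shrinks
```

The composite Jacobian is applied as `Lf` at `g(c)` following `Lg` at `c`, which mirrors the order $L_f \circ L_g$ in the statement.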




MATHEMATICAL EDUCATION 
EDITED BY J. G. HARVEY AND M. W. POWNALL


Material for this Department should be sent to Shirley Hill, Department of Mathematics, 
University of Missouri, Kansas City, MO 64110, or to Paul Mielke, Department of Mathe- 
matics, Wabash College, Crawfordsville, IN 47933. 


SURVIVAL OF THE TWO-YEAR COLLEGE MATHEMATICS TEACHER 
P. A. LINDSTROM, Genesee Community College, Batavia, New York


In "Survival kit for the college mathematician" [4] and "Survival for mathematicians
or mathematics?" [6] Professors Flanders and Peterson both discuss the future
of teaching and research at the college and university levels. Flanders believes that
college teachers of mathematics have a professional obligation to survive as mathematicians.
Peterson believes that the problem of survival affects not only the level of
college mathematicians, but all levels of the academic mathematical system.

How then is the teacher of mathematics at the two-year college to survive? Many 
people would say that since the two-year college faculties are highly student-oriented, 
more so than the discipline-oriented faculties of the four-year colleges and universities,
the two-year college teacher of mathematics can survive by being a "good
teacher." It is not the purpose of this paper to dwell upon his survival by being a
"good teacher." Instead, there are two other areas of survival that affect him and that
are closely related to that of being a "good teacher." These are the overlapping areas
of 1. professional obligations and 2. professional identity.

Teaching his classes, advising students, serving on committees, etc. are but some 
of the obligations of the two-year college mathematics teacher. With regard to his 
profession he also has many obligations so that he can survive as a mathematician 
and a teacher. Christie and Wells [1], Flanders [4], and various CUPM publications 
[2 and 3], suggest many ways that are applicable to the two-year college mathematics 
teacher. Some of these are: 

1. Scan and read textbooks and journals to keep up to date with mathematics 
and to find new and interesting problems and material for the classroom. 

2. Organize a mathematics discussion group to discuss not only possible new 
courses, teaching techniques, textbook selection, etc., but also journal articles, mathe- 


PROBLEMS AND SOLUTIONS 


EDITED BY Emory P. STARKE 


ASSOCIATE EDITORS: JOSHUA BARLAZ, ERIC S. LANGFORD. COLLABORATING EDITORS: LEONARD
CARLITZ, GULBANK D. CHAKERIAN, HASKELL COHEN, S. ASHBY FOOTE, ISRAEL N. HERSTEIN,
MURRAY S. KLAMKIN, DANIEL J. KLEITMAN, ROGER C. LYNDON, MARVIN MARCUS, CHRISTOPH
NEUGEBAUER, ALBERT WILANSKY, AND UNIVERSITY OF MAINE PROBLEMS GROUP: EARL M. L.
BEARD, GEORGE S. CUNNINGHAM, CLAYTON W. DODGE, OSKAR FEICHTINGER, WILLIAM R.
GEIGER, RAMESH GUPTA, GARY HAGGARD, PHILIP M. LOCKE, JOHN C. MAIRHUBER, CURTIS
S. MORSE, GRATTAN P. MURPHY, EDWARD S. NORTHAM AND WILLIAM L. SOULE, JR.


All problems (both elementary and advanced) proposed for inclusion in this Department should
be sent to E. P. Starke, 1000 Kensington Ave., Plainfield, NJ 07060. Proposers of problems
are urged to enclose any solutions or information that will assist the editors. Ordinarily, prob- 
lems in well-known textbooks and results in generally accessible sources are not appropriate 
for this Department. No solutions (except those accompanying proposals) should be sent to 
Professor Starke. 


ELEMENTARY PROBLEMS 


Solutions of Elementary Problems should be sent to Problems Group, Mathematics Department, 
University of Maine, Orono, ME 04473. To facilitate their consideration, solutions of Elemen- 
tary Problems in this issue should be typed (with double spacing) and should be mailed before 
March 31, 1974. 


E 2444. Proposed by Ray Redheffer, University of California, Los Angeles 


Let $S$ be an open connected subset of real Euclidean space $\mathbf{R}^n$ and suppose that
$f: S \to \mathbf{R}^m$ is differentiable. Let the Jacobian matrix $Df(x)$ at $x \in S$ satisfy

$$\| Df(x) \| \le \sigma \| f(x) \|$$

for some constant $\sigma$ and all $x \in S$, where the norm of a matrix is the sum of the
absolute values of its entries. Show that if $x_1, x_2 \in S$ can be connected by a path of
length $d$ lying wholly within $S$, then

$$\| f(x_1) \| \le \| f(x_2) \|\, e^{\sigma d}.$$

(Note that a consequence of this is that $f$ cannot vanish anywhere on $S$ unless it
vanishes everywhere on $S$.)


E2445. Proposed by F. Leuenberger, Feldmeilen, Switzerland 

Let $P$ be a point in the interior of a triangle $ABC$. Let $R_1, R_2, R_3$ denote the
distances from $P$ to the vertices of $ABC$ and let $r_1, r_2, r_3$ denote the perpendicular
distances from $P$ to the sides of $ABC$. Show that


ro +13 1 R, 
r2+2R,+17r3; — 3° ro + £3 


— 
IA 


with equality if and only if the triangle is equilateral and P is its center. 


E2446. Proposed by H. D. Ruderman, Hunter College High School 


Characterize those moduli m for which both x* = 1 (mod m) implies x = 1 (mod m) 
and x° = 0 (modm) implies x = 0 (modm); show that these are precisely the 
moduli for which x* = y* (modm) implies x = y (modm) for all x,y. 


E2447. Proposed by E. T. H. Wang, University of Waterloo, Canada 


A $k$-satisfactory sequence is a $k$-tuple $S = (a_1, a_2, \dots, a_k)$ of natural numbers
with $a_1 \le a_2 \le \dots \le a_k$ such that $\sum a_i = \prod a_i$. Let $v(S)$ denote this common
value. Show that $v(S) \le 2k$ with equality if and only if $S = (1, \dots, 1, 2, k)$, and
investigate the problem of finding a lower bound for $v(S)$. (Cf. E2262 [1971, 1021].)
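For small $k$ the claimed bound can be checked by exhaustive search. The sketch below is an illustration only, not a solution; the function name `satisfactory_sequences` is hypothetical. It relies on one observation: since the sum equals the product and the sum is at most $k a_k$, the product of the first $k-1$ entries is at most $k$, so those entries range over a finite set and $a_k$ is then determined by them.

```python
from itertools import combinations_with_replacement
from math import prod

def satisfactory_sequences(k):
    """Yield every nondecreasing k-tuple of positive integers whose sum equals
    its product (k >= 2)."""
    # Sum = product <= k * a_k forces a_1 * ... * a_{k-1} <= k.
    for head in combinations_with_replacement(range(1, k + 1), k - 1):
        p, s = prod(head), sum(head)
        if p < 2 or p > k:
            continue  # p == 1 would require a_k = (k - 1) + a_k, impossible
        # a_k * p = s + a_k, hence a_k = s / (p - 1)
        last, remainder = divmod(s, p - 1)
        if remainder == 0 and last >= head[-1]:
            yield head + (last,)

for k in range(2, 7):
    print(k, sorted(satisfactory_sequences(k)))
```

For each $k$ tried, the tuple $(1, \dots, 1, 2, k)$ appears in the output and is the only one whose common value reaches $2k$.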


E 2448. Proposed by Gérard Letac, Université de Clermont, France 


Find all positive semi-definite Hermitian matrices $A = (a_{ij})$ with the property
that the matrix of reciprocals $(1/a_{ij})$ is also positive semi-definite.


E2449. Proposed by Frank Siwiec, John Jay College 


Let $f$ be a continuous mapping of $\mathbf{R}$ onto $\mathbf{R}$ with the property that for every
$y \in \mathbf{R}$, the boundary of the set $f^{-1}(y) = \{x \in \mathbf{R} : f(x) = y\}$ is compact. Show that $f$
is a closed mapping.


SOLUTIONS OF ELEMENTARY PROBLEMS 
A Curious Summation Inequality 


E 2373 [1972, 905]. Proposed by Grahame Bennett, Indiana University 


Let $r_1, r_2, \dots, r_n$ be real numbers. Show that there exists a subset $N$ of $\{1, 2, \dots, n\}$,
neither containing nor omitting three consecutive integers, such that

$$\Bigl| \sum_{j \in N} r_j \Bigr| \ge \frac{1}{6} \sum_{j=1}^{n} |r_j|.$$

Show further that 1/6 is the best possible constant here.
Establish the corresponding result (with 1/6 replaced by $1/3\pi$) for complex
numbers.


Solution by L. E. Mattics, University of South Alabama and C. S. Gardner,
University of Texas (independently). Let $\sum |r_j| = s$ and for $i = 0, 1, 2$, let
$p(i) = \sum \{r_j : r_j \ge 0 \text{ and } j \equiv i \pmod 3\}$ and $n(i) = \sum \{r_j : r_j < 0 \text{ and } j \equiv i \pmod 3\}$.
There exist distinct $i, i'$ such that either $p(i) + p(i') \ge s/3$ or such that $n(i) +
n(i') \le -s/3$; we assume without loss of generality that the first case holds.
Now if $p(i) + p(i') \ge -n(i) - n(i')$ then $2(p(i) + p(i')) + n(i) + n(i') \ge s/3$, so
that either $p(i) + p(i') + n(i) \ge s/6$ or $p(i) + p(i') + n(i') \ge s/6$. Similarly if
$p(i) + p(i') \le -n(i) - n(i')$ it follows that either $-p(i) - n(i) - n(i') \ge s/6$ or
$-p(i') - n(i) - n(i') \ge s/6$. This establishes the inequality. To show that the
constant 1/6 is best possible, take $r_1 = r_2 = r_3 = 1$ and $r_4 = r_5 = r_6 = -1$.
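The real case can be checked by brute force for short sequences. The following sketch is an illustration under stated assumptions, not part of the solution: the helper names are hypothetical, subsets are encoded as bitmasks, and "three consecutive integers" is read as any window $j, j+1, j+2$ inside $\{1, \dots, n\}$.

```python
def valid_subsets(n):
    """Yield bitmasks of subsets of {1..n} that neither contain nor omit
    three consecutive integers (bit j stands for the integer j + 1)."""
    for mask in range(1 << n):
        ok = True
        for j in range(n - 2):
            window = (mask >> j) & 0b111
            if window == 0b111 or window == 0:
                ok = False
                break
        if ok:
            yield mask

def best_subset_sum(r):
    """Largest |sum of r_j over j in N| over all admissible subsets N."""
    n = len(r)
    return max(
        abs(sum(r[j] for j in range(n) if (mask >> j) & 1))
        for mask in valid_subsets(n)
    )

extremal = [1, 1, 1, -1, -1, -1]
print(best_subset_sum(extremal), sum(abs(t) for t in extremal) / 6)
```

On the extremal sequence from the solution the best admissible sum equals $s/6$ exactly, which is consistent with 1/6 being sharp.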

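Now suppose that complex numbers $z_1, \dots, z_n$ are given. Write $z_j = r_j \exp(i\theta_j)$
and set $\sum |z_j| = \sum r_j = s$. Let $F(\theta) = \sum r_j\, |\cos(\theta_j - \theta)|$; an elementary
computation shows that

$$\int_0^{2\pi} F(\theta)\, d\theta = 4s,$$

so by the Mean Value Theorem, there exists $\theta_0$ such that $F(\theta_0) = 2s/\pi$. Applying
the inequality derived above for real numbers, we see that there exists a subset
$N \subseteq \{1, 2, \dots, n\}$ of the desired type such that

$$\Bigl| \sum_{j \in N} r_j \cos(\theta_j - \theta_0) \Bigr| \ge \frac{1}{6} F(\theta_0) = \frac{s}{3\pi}.$$

From this, it follows that

$$\frac{s}{3\pi} \le \Bigl| \sum_{j \in N} r_j \cos(\theta_j - \theta_0) \Bigr| = \Bigl| \sum_{j \in N} \operatorname{Re}\bigl(r_j e^{i(\theta_j - \theta_0)}\bigr) \Bigr| = \Bigl| \operatorname{Re}\Bigl(e^{-i\theta_0} \sum_{j \in N} z_j\Bigr) \Bigr| \le \Bigl| e^{-i\theta_0} \sum_{j \in N} z_j \Bigr| = \Bigl| \sum_{j \in N} z_j \Bigr|,$$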

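establishing the proof of the inequality in the complex case. To show that $1/3\pi$
is actually best possible, let $n = 12m$ and consider the sequence $\omega_1, \omega_2, \dots, \omega_n$ of
the $n$th roots of unity, where $\omega_j = \exp(i\theta_j)$ and $\theta_j = 2\pi j/n$. Let $N \subseteq \{1, 2, \dots, n\}$
be any subset of the specified type. Write $\omega = \sum_{j \in N} \omega_j$ and let $\theta = \arg \omega$. Note
that

$$\Bigl| \sum_{j \in N} \cos(\theta_j - \theta) \Bigr| = \Bigl| \sum_{j \in N} \operatorname{Re}\bigl(e^{i(\theta_j - \theta)}\bigr) \Bigr| = \Bigl| \operatorname{Re}\Bigl(e^{-i\theta} \sum_{j \in N} \omega_j\Bigr) \Bigr| = \bigl| \operatorname{Re}\bigl(e^{-i\theta} |\omega| e^{i\theta}\bigr) \bigr| = \Bigl| \sum_{j \in N} \omega_j \Bigr|.$$

Now partition the unit circle into $4m$ half-open subintervals $I_1, I_2, \dots, I_{4m}$,
where

$$I_k = \Bigl\{ z : |z| = 1 \text{ and } (k-1)\frac{\pi}{2m} < \arg z \le \frac{k\pi}{2m} \Bigr\}.$$

Note that by assumption on $N$, the number of $\omega_j$ with $j \in N$ which lie in any $I_k$
is either one or two, never zero or three (or more). Moreover, if we translate the
$\omega_j$ mod $2\pi$ by considering $\omega_j' = e^{-i\theta}\omega_j$, then the same will hold true for the $\omega_j'$
except for a possible "edge effect" which is negligible for large $n$. (More precisely,
one (and only one) $I_k$ could contain zero or three $\omega_j'$ because of the fact that possibly
$N$ contains (or excludes) $n$, 1, and 2 or $n-1$, $n$, and 1.) Now

$$\Bigl| \sum_{j \in N} \omega_j \Bigr| = \Bigl| \sum_{j \in N} \cos(\theta_j - \theta) \Bigr| = \Bigl| {\sum}_1 \cos(\theta_j - \theta) + {\sum}_2 \cos(\theta_j - \theta) \Bigr|,$$

where the first sum is over all $j \in N$ with $\cos(\theta_j - \theta) > 0$ and the second over all
$j \in N$ with $\cos(\theta_j - \theta) < 0$; that is, the first sum is over all $j \in N$ such that
$\omega_j' \in I_1 \cup \dots \cup I_m \cup I_{3m+1} \cup \dots \cup I_{4m}$ and the second over all $j \in N$ such that
$\omega_j' \in I_{m+1} \cup \dots \cup I_{3m}$. (Those $\theta_j$ such that $\cos(\theta_j - \theta) = 0$ can be ignored.) Assume
without loss of generality that $\sum_1 + \sum_2 \ge 0$. Using the fact that for large $n$,
the sums can be approximated by integrals, and the facts that in every subinterval
in $\sum_1$ there are at most two $\omega'$ and that in every subinterval in $\sum_2$ there is
at least one $\omega'$, we see that

$$\Bigl| \sum_{j \in N} \omega_j \Bigr| = {\sum}_1 \cos(\theta_j - \theta) + {\sum}_2 \cos(\theta_j - \theta) \le \Bigl(\frac{4m}{2\pi}\Bigr) \int_{-\pi/2}^{\pi/2} 2\cos\theta'\, d\theta' + \Bigl(\frac{4m}{2\pi}\Bigr) \int_{\pi/2}^{3\pi/2} \cos\theta'\, d\theta' + o(1)$$

$$= \frac{4m}{\pi} + o(1) = \frac{n}{3\pi} + o(1) = \frac{1}{3\pi} \sum_{j=1}^{n} |\omega_j| + o(1).$$

This shows that $1/3\pi$ is best possible in the complex case.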


Also solved by the proposer. 

Editor's comment. Related inequalities appear in the literature, but without the added feature of
"neither containing nor omitting three consecutive integers." The proposer refers to the following
inequality

$$(*)\qquad \Bigl| \sum_{j \in N} z_j \Bigr| \ge \frac{1}{\pi} \sum_{j=1}^{n} |z_j|$$

for complex numbers which can be found in Bourbaki, General Topology (part 2), Addison-Wesley,
1966, Chap. VIII, Ex. 1, § 3, p. 126. Bourbaki notes that the constant $1/\pi$ is best possible, but that it
cannot be achieved. This leads one to believe that the constant $1/3\pi$ for our problem, although best
possible, cannot be achieved as can the constant 1/6 for the real case. Edwin Klein calls attention to
the inequality (*) with the constant 1/6 (rather than $1/\pi$) which is derived in Rudin, Real and Complex
Analysis, McGraw-Hill, New York, 1966, p. 119. Rudin's argument is simpler than Bourbaki's,
which is to be expected since his constant is less precise.


A Birthday Problem 
E 2386 [ 1972, 1134]. Proposed by William Knight, University of New Brunswick 


The classical birthday problem can be phrased as a bet between a statistics teacher
and a class of $n < 365$ students, the teacher betting that at least two students have
the same birthday. (The usual stake is one-up-ness rather than money.) If birthdays
are (1) independently and (2) uniformly distributed over the 365 days of the year
(leap years being ignored) the probability of the teacher's winning is $1 - (365)_n/365^n$
where $(m)_n$ denotes the partial factorial $m!/(m-n)!$. But it is more likely that
birthdays are not really equally numerous at all seasons. Show that this, in fact,
makes the bet more favorable for the teacher; that is, if assumption (2) is dropped,
$1 - (365)_n/365^n$ is a lower bound attained only when all days are equally probable
as birthdays.


Solution by D. M. Bloom, Brooklyn College. The assumption $n \ge 2$ is clearly
intended; also, we may assume by induction that the result is true for all "years" of
fewer than 365 days (the result being trivial for a year of just one day). Let $x_i$ be the
probability of a birthday occurring on the $i$th day of the year; then the probability
that the teacher loses is

$$P(x_1, \dots, x_{365}) = (n!) \sum_{S} \Bigl( \prod_{i \in S} x_i \Bigr)$$

where $S$ runs over all $n$-element subsets of $\{1, \dots, 365\}$. Since $P$ is continuous, the
maximum of $P$ on the closed set $\{x_i \ge 0 \text{ (all } i), \ \sum x_i = 1\}$ exists, and it cannot occur
when any $x_i$ is zero (by the induction hypothesis and the fact that $(364)_n/364^n
< (365)_n/365^n$); hence the method of Lagrange multipliers is applicable (with
$\sum x_i = 1$ as the side condition) and implies that $\partial P/\partial x_i = \partial P/\partial x_j$ for all $i, j$ at
the point in question. Therefore

$$(*)\qquad 0 = \partial P/\partial x_i - \partial P/\partial x_j = (n!)(x_j - x_i) \sum_{T} \Bigl( \prod_{k \in T} x_k \Bigr)$$

where $T$ runs over all $(n-2)$-element subsets of $\{1, \dots, 365\}$ which contain neither $i$
nor $j$. Since $n \ge 2$, the summation in (*) is nonempty and hence nonzero, so that
$x_j - x_i = 0$, $x_i = x_j$ (all $i, j$), which is the desired result.
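The quantity $P$ in the solution is $n!$ times an elementary symmetric polynomial, so it can be evaluated exactly for any birthday distribution. The sketch below is only illustrative; the function name `prob_all_distinct` and the skewed six-day "year" are hypothetical choices, not data from the problem.

```python
from fractions import Fraction
from math import factorial

def prob_all_distinct(probs, n):
    """Probability that n independent birthdays drawn from `probs` are all
    different: n! times the degree-n elementary symmetric polynomial."""
    e = [Fraction(1)] + [Fraction(0)] * n
    for x in probs:
        # Standard in-place update; going downward keeps each day used once.
        for k in range(n, 0, -1):
            e[k] += x * e[k - 1]
    return factorial(n) * e[n]

uniform6 = [Fraction(1, 6)] * 6
skewed6 = [Fraction(1, 2)] + [Fraction(1, 10)] * 5   # hypothetical skewed year
print(prob_all_distinct(uniform6, 3), prob_all_distinct(skewed6, 3))

uniform365 = [Fraction(1, 365)] * 365
teacher_wins_23 = 1 - float(prob_all_distinct(uniform365, 23))
print(teacher_wins_23)
```

In the toy example the skewed distribution gives the class a smaller chance of having all birthdays distinct than the uniform one, in line with the statement being proved.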


Also solved by Ellen Hertz, Harry Lass, Carolyn MacDonald, William Nuesslein, G.S. Rogers & 
D. L. Young, Michael Shimshoni (Israel), and the proposer. 


Editorial Note. Various versions of the birthday problem have been dealt with in the literature. 
The reader is referred to articles on this subject in the American Statistician, Feb. 1968, April 1968, 
Feb. 1970, June 1972. 


A Pair of Triangle Inequalities 
E 2388 [1972, 1135]. Proposed by A. W. Walker, Toronto, Canada 


Let $a, b, c$; $s, r, R, I, H$ denote the side lengths, semiperimeter, inradius, circumradius,
incenter and orthocenter of a triangle $ABC$.
(i) For $ABC$ arbitrary, prove that

$$bc + ca + ab \ge (AI + BI + CI)^2$$

with equality if and only if the triangle is equilateral.
(ii) For $ABC$ non-obtuse, prove that $s^2 \ge 2R^2 + 8Rr + 3r^2$ or, equivalently,

$$a^2 + b^2 + c^2 \ge (AH + BH + CH)^2,$$

with equality if and only if $ABC$ is equilateral or right isosceles.


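Solution by M. G. Greening, University of New South Wales, Australia.
(i)

$$\sum ab - \Bigl(\sum AI\Bigr)^2 = 4R^2\Bigl[\sum \sin A \sin B - 4\Bigl(\sum \sin^2 \tfrac12 A \sin^2 \tfrac12 B + 2 \sin \tfrac12 A \sin \tfrac12 B \sin \tfrac12 C \sum \sin \tfrac12 A\Bigr)\Bigr]$$

$$= 16R^2\Bigl[3 \sin \tfrac12 A \sin \tfrac12 B \sin \tfrac12 C - 2 \sin \tfrac12 A \sin \tfrac12 B \sin \tfrac12 C \sum \sin \tfrac12 A\Bigr] = 4Rr\Bigl(3 - 2 \sum \sin \tfrac12 A\Bigr) \ge 0, \quad \text{(See [1], 2.10.)}$$

with equality holding only for $ABC$ equilateral.
(ii) Set $\alpha = \pi - 2A$, $\beta = \pi - 2B$, $\gamma = \pi - 2C$. Then

$$\sum a^2 - \Bigl(\sum AH\Bigr)^2 = 4R^2\Bigl(\sum \cos^2 \tfrac12 \alpha - \Bigl[\sum \sin \tfrac12 \alpha\Bigr]^2\Bigr) \ge 0$$

and this is (ii). See [2].

If $A = \tfrac12 \pi$, then $\sum \cos^2 \tfrac12 \alpha = 2$ and $(\sum \sin \tfrac12 \alpha)^2 = 1 + \sin \beta$, so that equality holds
only for $\tfrac12 \beta = \tfrac14 \pi = \tfrac12 \gamma$.

Otherwise (ii) depends on the inequalities

(iii) $(\sin \tfrac12 \alpha - \tfrac12)(\sin \tfrac12 \beta - \tfrac12) \ge 0$ and

(iv) $2 \sin \tfrac12 \beta \sin \tfrac12 \gamma + \sin \tfrac12 \alpha \le 1$,
which is equivalent to $\cos \tfrac12 (\beta - \gamma) \le 1$, so that equality holds only for $ABC$
equilateral.
The resulting inequality $s^2 \ge 2R^2 + 8Rr + 3r^2$ is stronger than either $s^2 \ge
3r(4R + r)$ or $s^2 \ge (16R - 5r)r$ mentioned in [1].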


[1] Bottema et al., Geometric Inequalities. 

[2] Problem E 1272, this MONTHLY, 67 (1960) 693-694. 
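Both inequalities of E 2388 can be spot-checked numerically. The sketch below is an illustration only, with hypothetical names; it uses the standard formulas $a = 2R\sin A$, $r = 4R \sin\tfrac12 A \sin\tfrac12 B \sin\tfrac12 C$, $AI = r/\sin\tfrac12 A$ and $AH = 2R|\cos A|$, and also evaluates the closed form $4Rr(3 - 2\sum \sin \tfrac12 A)$ from part (i).

```python
import math

def gaps(A, B, R=1.0):
    """Return (gap_i, gap_ii, closed_form_i) for the triangle with angles
    A, B, pi - A - B (radians) and circumradius R."""
    angles = (A, B, math.pi - A - B)
    a, b, c = (2 * R * math.sin(t) for t in angles)
    r = 4 * R * math.prod(math.sin(t / 2) for t in angles)
    to_incenter = [r / math.sin(t / 2) for t in angles]         # AI, BI, CI
    to_orthocenter = [2 * R * abs(math.cos(t)) for t in angles]  # AH, BH, CH
    gap_i = b * c + c * a + a * b - sum(to_incenter) ** 2
    gap_ii = a * a + b * b + c * c - sum(to_orthocenter) ** 2
    closed_form_i = 4 * R * r * (3 - 2 * sum(math.sin(t / 2) for t in angles))
    return gap_i, gap_ii, closed_form_i

equilateral = gaps(math.pi / 3, math.pi / 3)
right_isosceles = gaps(math.pi / 2, math.pi / 4)
scalene_acute = gaps(math.radians(50), math.radians(60))
print(equilateral, right_isosceles, scalene_acute)
```

The gaps vanish (up to rounding) exactly in the equality cases named in the problem and are positive for the scalene acute example.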

Also solved by Anders Bager (Denmark), Leon Bankoff, A. G. Ferrer (Mexico), C. S. Gardner, 
Leon Gerber, Leonard Goldstone, M.S. Klamkin, and the proposer. 


Six Equal Regions? Yes. Seven? No. 


E 2391 [1973, 74]. Proposed by V. R. R. Uppuluri, Oak Ridge National
Laboratory 


It is well known that three chords can divide a circular disk into at most seven 
pieces. Can these seven pieces all have the same area? 


I. Solution by the Rose-Hulman Problems Group. The answer is no. Each of the
three chords must intersect the other two and divide the disk into two sections with
areas of ratio four to three. The envelope of all such chords is a circle. We construct
the figure below.

[Figure omitted.]


Let A”B’ divide the disk into a ratio of four to three. If C’B” is also such a chord, 
one easily verifies that the angle A”BC’ is acute. By symmetry, the construction of 
the final chord, A’C”, yields an equilateral triangle, ABC. After rotating through 2 
radians we find that the image of the region CC”C’ (broken lines) is properly contained 
in the region A’B”BA and therefore CC’C’ and A’B”BA cannot have equal areas. 


II. Solution by D. W. Atkinson, University of Nebraska at Omaha. The answer
is no. Assume a solution does exist. It is easy to show that this solution must be
symmetrical; i.e., all chords intersect in 60° angles and are equidistant from the
center of the circle. Let $s$ be the distance from the center to each chord. The area of
the central equilateral triangle is $3s^2\sqrt{3}$. In a unit circle we have $3s^2\sqrt{3} = \pi/7$ or
$s = [\pi/(21\sqrt{3})]^{1/2}$. The value $\pi/7$ should also be the area of one of the "triangle-like" pieces
(AA'A" in the figure). Construct a tangent to the circle and extend the two chords
forming the "triangle-like" piece to form an equilateral triangle (APO in the figure)
which contains the "triangle-like" piece. It should have area greater than $\pi/7$.
However, its height is $1 - 2s$, and so its area is $(1 - 2s)^2/\sqrt{3}$. Substituting for $s$
shows that this area is less than $\pi/7$. Thus no solution exists.
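The final substitution in Solution II is a short computation. The sketch below simply evaluates the quantities named in the solution for a unit circle; the variable names are hypothetical.

```python
import math

# Distance from the center to each chord, from 3 * s^2 * sqrt(3) = pi / 7.
s = math.sqrt(math.pi / (21 * math.sqrt(3)))

central_triangle_area = 3 * math.sqrt(3) * s ** 2
required_area = math.pi / 7           # one seventh of the unit disk

# Equilateral triangle with height 1 - 2s that contains a "triangle-like" piece.
height = 1 - 2 * s
containing_triangle_area = height ** 2 / math.sqrt(3)

print(s, required_area, containing_triangle_area)
```

The containing triangle turns out to be far smaller than $\pi/7$, so the piece inside it cannot have the required area.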


III. Solution by V. Linis, University of Ottawa. The answer is no. In the seven
piece configuration each of the three chords divides the area of the disk in the ratio
3 : 4. Such a chord subtends a central angle $\alpha$ which satisfies the equation

$$(*)\qquad \alpha - \sin \alpha = 6\pi/7$$

and has distance $d = r \cos \tfrac12 \alpha$ from the center ($r$ = radius of the disk). If we let
$\alpha = \pi - 2\beta$ the equation (*) becomes $2\beta + \sin 2\beta = \pi/7$ which has an approximate
solution $\beta = \pi/28$ since $\beta$ is rather small. Then $d = r \sin \beta \approx \pi r/28$. The central
piece is a triangle with $d$ as inradius, therefore its semiperimeter $s$ satisfies the equation
$sd = \pi r^2/7$. It follows that $s \approx 4r$, which is plainly impossible (a triangle contained in the
disk has perimeter less than $2\pi r$, so $s < \pi r$).
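The approximation in Solution III can be confirmed by solving the equation for $\beta$ numerically. The sketch below is an illustration with hypothetical names; it uses plain bisection on $2\beta + \sin 2\beta = \pi/7$ with $r = 1$ and then forms the semiperimeter from $sd = \pi r^2/7$.

```python
import math

def solve_beta(tolerance=1e-14):
    """Bisection for 2*beta + sin(2*beta) = pi/7 on [0, pi/4]; the left side
    is increasing there, so the root is unique."""
    low, high = 0.0, math.pi / 4
    target = math.pi / 7
    while high - low > tolerance:
        mid = (low + high) / 2
        if 2 * mid + math.sin(2 * mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

beta = solve_beta()
d = math.sin(beta)                  # distance of each chord from the center
semiperimeter = math.pi / (7 * d)   # from s * d = pi * r^2 / 7 with r = 1
largest_inscribed = 3 * math.sqrt(3) / 2  # semiperimeter of the inscribed equilateral triangle

print(beta, math.pi / 28, semiperimeter, largest_inscribed)
```

The computed semiperimeter is close to 4, well above what any triangle inside the unit disk can have.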


IV. Comment by M. S. Klamkin, Ford Motor Company (similar comment by
R. C. Buck, University of Wisconsin at Madison). The answer is negative even if the
circular region is replaced by a convex region; see R. C. Buck and E. F. Buck, Equipartition
of convex sets, Math. Mag., 22 (1949) 195-198, where it is shown that at most
six of the regions can have the same area and that these equal regions must be the
six outer ones.


Also solved by R. P. A’Hern (England), Peter Avery (England), Anders Bager (Denmark), 
Merrill Barnebey, C. C. Clever & K. L. Yocom, R. B. Eggleton, Arthur Gittleman, Michael Gold- 
berg, S. H. Greene, Ralph Jones, L. L. Keener, Dan Kenway & Rici Liknaitzky, P. G. Kirmser, 
Lew Kowarski, O. P. Lossers (Netherlands), Carolyn MacDonald, Carl Maltz, Greg Maxwell, M. D. 
Meyerson, Larry Olson, C. C. Oursler, W. W. Parsons, D. B. Price, E. S. Rosenthal, Ralph Seifert, 
Jr., Phil Tracy, G. Tsintsifas (Greece), J. H. Wahab, K. G. Willett, and J. N. Younglove. 


Volume of a Simplex 


E2393 [1973, 75]. Proposed by M. S. Klamkin, Ford Motor Company 


Parallel lines are drawn through the vertices $A_0, A_1, \dots, A_n$ of a given simplex
of volume $V$, terminating in the opposite faces (extended if necessary) in the points
$B_0, B_1, \dots, B_n$, respectively.

(1) Show that the volume of the simplex determined by $B_0, B_1, \dots, B_n$ is $nV$.

(2) Show that the volume of the simplex determined by the vertices
$A_0, A_1, \dots, A_r, B_{r+1}, B_{r+2}, \dots, B_n$ is given by $V_r = |n - r - 1|\,V$.


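Solution by Leon Gerber, St. John's University. Parallel lines are drawn through
the vertices $A_0, A_1, \dots, A_n$, etc. Let the weights of the point $P$ with respect to the given
simplex be $(p_i)$ where $\sum_{i=0}^{n} p_i = s$, with $s = 1$ if $P$ is a proper point, and $s = 0$
if $P$ is improper. Then the cevians $A_iP$ (which are parallel if $P$ is improper) meet
the face opposite $A_i$ in $B_i = (b_{ij})$ where $b_{ii} = 0$ and $b_{ij} = p_j/(s - p_i)$. Thus the
ratio of the content of $A_0 \cdots A_{r-1} B_r \cdots B_n$ to that of the given simplex is

$$\det \begin{pmatrix} 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 1 & 0 & \cdots & 0 \\ b_{r0} & \cdots & b_{r,r-1} & b_{rr} & \cdots & b_{rn} \\ \vdots & & \vdots & \vdots & & \vdots \\ b_{n0} & \cdots & b_{n,r-1} & b_{nr} & \cdots & b_{nn} \end{pmatrix} = \det \begin{pmatrix} 0 & \cdots & p_n/(s - p_r) \\ \vdots & \ddots & \vdots \\ p_r/(s - p_n) & \cdots & 0 \end{pmatrix} = (r - n) \prod_{i=r}^{n} p_i/(p_i - s).$$

The absolute value is $n - r$ if $s = 0$.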
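The determinant evaluation in the solution can be checked with exact rational arithmetic. The following sketch is illustrative only: the helper names and the sample weights are hypothetical, and the matrix is built as described in the solution (unit rows for the vertices $A_i$, rows $b_{ij} = p_j/(s - p_i)$ with zero diagonal for the points $B_i$).

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size, sign, result = len(m), 1, Fraction(1)
    for col in range(size):
        pivot = next((row for row in range(col, size) if m[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for row in range(col + 1, size):
            factor = m[row][col] / m[col][col]
            for k in range(col, size):
                m[row][k] -= factor * m[col][k]
    return sign * result

def content_ratio(p, r):
    """Determinant for the simplex A_0 ... A_{r-1} B_r ... B_n with weights p."""
    n, s = len(p) - 1, sum(p)
    rows = []
    for i in range(n + 1):
        if i < r:   # vertex A_i: unit row
            rows.append([Fraction(int(i == j)) for j in range(n + 1)])
        else:       # point B_i on the face opposite A_i
            rows.append([Fraction(0) if i == j else Fraction(p[j]) / (s - p[i])
                         for j in range(n + 1)])
    return det(rows)

def closed_form(p, r):
    n, s = len(p) - 1, sum(p)
    value = Fraction(r - n)
    for i in range(r, n + 1):
        value *= Fraction(p[i]) / (p[i] - s)
    return value

improper = [1, 2, 3, -6]   # hypothetical weights with s = 0 (parallel lines)
print([abs(content_ratio(improper, r)) for r in range(4)])
```

With weights summing to zero the absolute values come out as $n - r$, which for $r = 0$ is the factor $n$ of part (1).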


Also solved by G. Tsintsifas (Greece), and the proposer. M. G. Greening (Australia) submitted 
a very clear solution to the first part. 


A Combinatorial Problem 


E 2395 [1973, 75]. Proposed by H. W. Gould, West Virginia University 


Let $n$ be a nonnegative integer. For $p = 1, 2, \dots$ define


Aid = EO Ui) (ep 


where we make the usual conventions regarding binomial coefficients. Prove that,
whenever $n$ is odd, $A_1(n) = nA_2(n)$.


Solution by the St. Olaf Problem Group. Let $n = 2m + 1$. Equating the coefficients
of $x^{2m}$ in $(1 + x)^n(1 - x)^n = (1 - x^2)^n$ and simplifying, one finds that $A_1(n)
= (-1)^m \binom{n}{m}$. Also if we note that $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$, it follows that

$$A_2(n) = (-1)^m \left\{ \binom{n-1}{m} - \binom{n-1}{m+1} \right\} = \frac{(-1)^m}{n} \binom{n}{m}.$$

Hence $A_1(n) = nA_2(n)$.
Also solved by M. T. Bird, D. M. Bloom, Robert Breusch, R. A. Gibbs & H.S. Stocker, Elliot 


Goldstein & Robert Spira, M.G. Greening (Australia), O.P. Lossers (Netherlands), Joseph 
O’Rourke, Phil Tracy, and David Zeitlin. 


Editorial Comment. The result is not true if $n$ is even, since in that case $A_1(n) = 0$ and $A_2(n) \neq 0$.
It would be of interest to find some kind of recurrence relation between $A_p(n)$ and $A_{p-1}(n)$ valid for
any $p$.


ADVANCED PROBLEMS 


All solutions of Advanced Problems should be sent to J. Barlaz, Rutgers — The State University, 
New Brunswick, N. J.08903. Solutions of Advanced Problems in this issue should be typed (with 
double spacing) on separate, signed sheets and should be mailed before March 31, 1974. 


An asterisk (*) means neither the proposer nor the editors supplied a solution. 


5940. Proposed by Donald Minassian, Indianapolis, Indiana 


Let R be a commutative ring with 1 and let I be an ideal of R. On p. 131 of 
Introduction to Modern Algebra (D. C. Heath, 1963), W. Barnes claims $I$ is noetherian
(i.e., as a ring which may lack a unit, and defining the ideal generated by a subset T 
of a ring as the smallest ideal containing T). Prove or give a counterexample. 


5941. Proposed by Jan Mycielski, University of Colorado 


Prove (without using the axiom of choice) that $R/Q$ is of the same cardinality as
$B/F$, where $R$ is the additive group of real numbers, $Q$ is the additive group of rational
numbers, $B$ is the Boolean algebra of all subsets of the set of integers, and $F$ is the
ideal of finite sets of integers.


5942*. Proposed by D. M. Bloom, Brooklyn College 


Let $X_1, X_2, X_3$ be independent random variables such that $E(X_1) > E(X_2)
> E(X_3)$. Assume that the $X_i$ have normal distributions with a common variance.
Prove or disprove: if $P(X_1 > X_2) = K$ and $P(X_2 > X_3) = L$, then $P(X_1 > X_3)$ is
greater than $M$ where $M$ is defined by

$$\frac{K}{1-K} \cdot \frac{L}{1-L} = \frac{M}{1-M}.$$

5943. Proposed by L.J. Wallen, University of Hawaii 
Let $V$ be a vector space over some field and let $L(V)$ denote the algebra of all
endomorphisms of $V$. A set $\mathcal{S} \subset L(V)$ is transitive if $\mathcal{S}x = V$ for each $x \in V$, $x \neq 0$.
Let $\mathfrak{A}$ be a transitive subalgebra of $L(V)$. Determine the automorphisms $\alpha$ of $\mathfrak{A}$ having
the property that whenever $\mathcal{S} \subset \mathfrak{A}$ is transitive, so is $\alpha(\mathcal{S})$.
5944. Proposed by L. J. Wallen, University of Hawaii 


Let $H$ be a separable, complex, infinite-dimensional Hilbert space. A venerable
theorem of Halmos states that every contraction is the weak limit of a sequence of 
unitaries. What is the weak sequential closure of the class of operators similar to 


unitaries? 
5945. Proposed by R. Sivaramakrishnan, Engineering College, Trichur, India 


Kesava Menon defines the norm $f^*(n)$ of a multiplicative function $f(n)$ by

$$f^*(n) = \sum_{d \mid n^2} f(n^2/d)\, \lambda(d)\, f(d),$$

in which $\lambda(n) = (-1)^{k(n)}$ where $k(n)$ represents the total number of prime factors
of $n$, each being counted according to its multiplicity.

Characterize the class of multiplicative arithmetic functions $f(n)$ which satisfy
$f^*(n) = [1/n]$, $[x]$ being the integral part of $x$.


SOLUTIONS OF ADVANCED PROBLEMS 


Partitions with Even Minimal Part 


5865 [1972, 668]. Proposed by G. E. Andrews, Pennsylvania State University


Let $Q_n$ denote the set of partitions of $n$ into distinct non-negative parts with an
even number as the smallest part. Let $q_e(n)$ (resp. $q_o(n)$) denote the number of elements
of $Q_n$ that have an even number (resp. odd number) of even parts. Prove that

$$q_o(n) - q_e(n) = \begin{cases} 1 & \text{if } n \text{ is a square,} \\ 0 & \text{otherwise.} \end{cases}$$


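Solution by Allen Stenger, Student, Pennsylvania State University. The
generating series for $q_o(n) - q_e(n)$ is

$$\sum_{n=0}^{\infty} x^{2n}(1 + x^{2n+1})(1 - x^{2n+2})(1 + x^{2n+3}) \cdots = \prod_{n=1}^{\infty} (1 - x^{2n})(1 + x^{2n-1}) \left\{ 1 + \frac{x^2}{(1+x)(1-x^2)} + \frac{x^4}{(1+x)(1-x^2)(1+x^3)(1-x^4)} + \cdots \right\}.$$

The series inside the braces is

$$\frac12 \left\{ \prod_{n=1}^{\infty} \frac{1}{(1 - x^{2n})(1 + x^{2n-1})} + \prod_{n=1}^{\infty} (1 + x^{2n-1}) \right\}.$$

To prove this, use the known identity

$$(1)\qquad 1 + \frac{ax}{1-x} + \frac{a^2x^2}{(1-x)(1-x^2)} + \cdots = \prod_{n=1}^{\infty} \frac{1}{1 - ax^n},$$

add the results of putting $a = 1$ and $a = -1$, divide by 2, note that

$$\prod_{n=1}^{\infty} \frac{1}{1 + x^n} = \prod_{n=1}^{\infty} \frac{1 - x^n}{1 - x^{2n}} = \prod_{n=1}^{\infty} (1 - x^{2n-1}),$$

and replace $x$ by $-x$. Hence our generating series is

$$\frac12 + \frac12 \prod_{n=1}^{\infty} (1 - x^{2n})(1 + x^{2n-1})^2 = \frac12 + \frac12 \sum_{m=-\infty}^{\infty} x^{m^2} = \sum_{m=0}^{\infty} x^{m^2},$$

where the next-to-last equality comes from putting $z = 1$ in Jacobi's identity

$$(2)\qquad \prod_{n=1}^{\infty} (1 - x^{2n})(1 + zx^{2n-1})(1 + z^{-1}x^{2n-1}) = \sum_{m=-\infty}^{\infty} z^m x^{m^2}.$$

This is the desired result: the coefficient of $x^n$ is 1 if $n$ is a square, and zero otherwise.
(The identities (1) and (2) may be found in Hardy and Wright, Theory of
Numbers.)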
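The result of 5865 can be tested by direct enumeration for small $n$. The sketch below is an illustration under an explicit assumption: 0 is allowed as a part (as the phrase "non-negative parts" suggests), and the quantity computed is the number of partitions in $Q_n$ with an odd number of even parts minus the number with an even number of even parts, which is the signed count produced by the generating series in the solution. All function names are hypothetical.

```python
from math import isqrt

def distinct_partitions(n, max_part=None):
    """Yield partitions of n into distinct positive parts, as decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest

def signed_count(n):
    """(# partitions in Q_n with an odd number of even parts)
    minus (# with an even number of even parts), with 0 allowed as a part."""
    total = 0
    for parts in distinct_partitions(n):
        evens = sum(1 for part in parts if part % 2 == 0)
        # Variant with the extra part 0: the smallest part 0 is even.
        total += 1 if (evens + 1) % 2 == 1 else -1
        # Variant without 0: admissible only if the smallest part is even.
        if parts and parts[-1] % 2 == 0:
            total += 1 if evens % 2 == 1 else -1
    return total

print([signed_count(n) for n in range(17)])
```

The printed list has a 1 at the square indices 0, 1, 4, 9, 16 and 0 elsewhere.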


Also solved by L. Carlitz, M. G. Greening (Australia), Phil Tracy, and the proposer. 




Real Copositive Quadratic Forms 


5867 [1972, 780]. Proposed by D. E. Daykin, Reading University, England 
Let Q be the real quadratic form 


How can we ensure that $Q \ge 0$ whenever all $x_i \ge 0$?


Partial solution by R. D. Leitch, Royal Military College of Science, Shrivenham,
England. We shall consider the general case and find a necessary and sufficient
condition for the quadratic form

$$Q = \sum_{i,j=1}^{n} a_{ij} x_i x_j, \qquad a_{ij} = a_{ji},$$

to be non-negative when all $x_i \ge 0$. We designate the region where $x_i \ge 0$ as the
first quadrant. Since $Q$ is homogeneous we need only consider its values on $S^{n-1}$:
$\{x \mid \sum x_i^2 = 1\}$. Let $U_n$ be that portion of $S^{n-1}$ lying in the first quadrant. We shall
be considering eigenvectors of various symmetric matrices and shall say that an
eigenvector is negative if its associated eigenvalue is negative. We need the following
lemma.


LEMMA. Let $Q$ be the quadratic form $\sum a_{ij} x_i x_j$, and let $A = (a_{ij})$. Then, if
$D$ is a closed subset of $S^{n-1}$, such that $Q$ restricted to the boundary of $D$ is non-negative,
$Q$ takes negative values in the interior of $D$ if there is a negative eigenvector
of $A$ in $D$.


This is proved by considering the local minima of $Q$ on $S^{n-1}$, using partial
differentiation and Lagrange's multipliers.

We shall need to consider the submatrices of $A$ lying along the main diagonal.
Let $A(r_1, \dots, r_m)$ be that submatrix of $A$ formed by the $r_1$th, $r_2$th, $\dots$, $r_m$th rows and
columns of $A$. Let $Q(r_1, \dots, r_m)$ be the quadratic form defined by $A(r_1, \dots, r_m)$. Observe
that $Q(r_1, \dots, r_m)$ is $Q$ restricted to the $r_1$th, $r_2$th, $\dots$, $r_m$th coordinates of $(x_1, \dots, x_n)$,
the other coordinates being zero.


THEOREM. If $Q$ is non-negative in the first quadrant, then

1. $a_{ii} \ge 0$, $i = 1, \dots, n$,

2. $A$ and $A(r_1, \dots, r_m)$ have no negative eigenvectors in the first quadrant, for every
possible $(r_1, \dots, r_m)$.


Proof. By induction on $n$. Let $n = 2$, and consider the quadratic form $Q
= ax^2 + 2bxy + cy^2$ and matrix

$$\begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$

Clearly, if either or both of $a, c$ are negative, $Q$ is negative along one or both of
the coordinate axes. Applying the lemma with $D = U_2$, we have the theorem when
$n = 2$.

Suppose we have the theorem up to $n - 1$, and that the conditions of the theorem
hold. Then $Q(i)$ is non-negative in the first quadrant for $i = 1, 2, \dots, n$, and in particular,
$Q$ restricted to the boundary of $U_n$ is non-negative. Applying the lemma gives
us the theorem.


Editorial comments. (1) Eric Langford notes that the question has been investigated in J. W. 
Gaddum, Linear inequalities and quadratic forms, Pacific J. Math. 8 (1958), 411-414. 

(2) Thomas Markham points out that a characterization of copositive quadratic forms attributed 
to Garsia and Baumert appears as Theorem 4.1. in the paper, On classes of copositive matrices, by R. 
W. Cottle, G. J. Habetler and C. E. Lemke, Linear algebra and its applications, 3 (1970), 295-310. 

(3) Milan Lustig (The Technical University, Brno, Czechoslovakia) offers the following sufficient 
condition for n = 4: (i) a;, 20; (ii) For all i, 7 = 1, 2, 3, 4, there exist Piz 0 such that (a) p;;=0, 
(b) Pik= 1,i = 1, 2, 3, 4; (c) a; 2 0 or a < Dj ;P j49444 -fori #/j. Forn = 4, Lustig offers 


iii Ji 5 


the following necessary condition: (i) a;, 20, (ii) a; j 2 0ora; Ss a; 4;; fori ~ j. 


(4) The proposer also suggests the more general problem in which the conditions $x_i \ge 0$ are
replaced by $\sum_i b_{ki} x_i \ge 0$, $k = 1, 2, \dots, m$ $(m \le n)$.


Hereditarily Normal Stone-Cech Compactifications


5870 [1972, 780]. Proposed by D. J. Lutzer and F. G. Slaughter, Jr., University 
of Pittsburgh 


For which discrete spaces $D$ is $\beta D$ hereditarily normal? ($\beta D$ denotes the Stone-Cech
compactification of $D$.)


Solution by A. A. Jagers, Technische Hogeschool Twente, Enschede, Netherlands.
If $D$ is finite then $D$ is compact and $\beta D$ coincides with the discrete space $D$.
On the other hand, if $D$ is infinite, $\beta D$ contains a homeomorphic copy of $\beta N$ where $N$
is the discrete space of natural numbers, and $\beta N$ contains a subspace $X$ which is not
normal (cf. example 3 on p. 133 of R. Engelking, Outline of General Topology,
(1968)). Hence $\beta D$ is hereditarily normal if and only if $D$ is finite.


Also solved by R. Dyckhoff (England), Melvin Henriksen, J. H. Weston, Albert Wilansky, 
and the proposer. 


The Equation $\partial f/\partial x = \partial f/\partial y$


5871 [1972, 780]. Proposed by P.R. Chernoff, University of California, 
Berkeley 


Let $f(x, y)$ be a real-valued function of two real variables which is separately
differentiable. Assume that $\partial f/\partial x = \partial f/\partial y$ everywhere. Must there be a function $g$
of one variable such that $f(x, y) = g(x + y)$? What if we assume a priori that $f$
is jointly continuous?


Solution by K. F. Andersen, University of Alberta. The answer is yes; in fact
the domain $D$ of $f$ need not be the entire plane. The conclusion holds provided $D$
has the property that the intersection of $D$ with every line of slope $-1$ is a connected
set. If $D$ is such a set, let $D^* = \{(x, y) : (x, y - x) \in D\}$ and put $h(x, y) = f(x, y - x)$
for $(x, y) \in D^*$. Then, for each fixed $y$, $\{x : (x, y) \in D^*\}$ is connected, and since

$$\frac{\partial h}{\partial x}(x, y) = f_1(x, y - x) + (-1) f_2(x, y - x) = 0,$$

$h(x, y) = g(y)$ is independent of $x$. Hence $f(x, y) = h(x, x + y) = g(x + y)$ for all
$(x, y) \in D$.
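
The change of variables used in this solution can be checked numerically. The following is only a minimal sketch and not part of the published solution: the sample function g and the helper names f, h and max_spread_in_x are assumptions chosen for illustration. It starts from a function of the form f(x, y) = g(x + y), forms h(x, y) = f(x, y - x), and measures how much h varies with x for fixed y.

```python
import math

def g(t):
    # Hypothetical sample function of one variable (an assumption for illustration).
    return math.sin(t) + t * t

def f(x, y):
    # A function of the form f(x, y) = g(x + y), so its two partials agree.
    return g(x + y)

def h(x, y):
    # The substitution used in the solution: h(x, y) = f(x, y - x).
    return f(x, y - x)

def max_spread_in_x(y, xs):
    # Largest deviation of h(x, y) from h(xs[0], y) over the sample points xs.
    base = h(xs[0], y)
    return max(abs(h(x, y) - base) for x in xs)

xs = [-3.0, -1.5, 0.0, 0.25, 2.0, 7.5]
spreads = [max_spread_in_x(y, xs) for y in (-2.0, 0.0, 1.0, 4.5)]
print(spreads)
```

Each spread should be zero up to rounding, which reflects the step "h(x, y) = g(y) is independent of x"; the check says nothing about the differentiability hypotheses themselves.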


Generators for a Function Ring 
5873 [1972, 913]. Proposed by Helge Tverberg, University of Bergen, Norway 


Those real polynomials in $x$ and the greatest integer function $[x]$ which are continuous
functions of $x$ form a ring $A$, containing $R$. Find the minimal set of generators,
over $R$, of $A$.


Solution by D. Z. Djokovic, University of Waterloo. Let $f(x) = x$, $g(x) = [x]$
for $x \in R$ and $h = f - g - \tfrac12$. The functions $h^2$ and $h(h^2 - \tfrac14)$ are continuous and if $P$
is a polynomial in two variables we can write

$$P(f, g) = P(f, f - h - \tfrac12) = Q(f, h) = Q_1(f, h^2) + h\,Q_2(f, h^2)$$
$$= Q_1(f, h^2) + h(h^2 - \tfrac14)\,Q_3(f, h^2) + h\,Q_4(f),$$

where $Q$, $Q_1$, $Q_2$, $Q_3$, $Q_4$ are suitable polynomials. It follows that $P(f, g)$ is continuous
if and only if $Q_4 = 0$. Thus $A$ is generated over $R$ by the functions

$f$, $h^2$ and $h(h^2 - \tfrac14)$.


We claim that $A$ cannot be generated by two elements over $R$. If this were so then $A$
would be isomorphic to the polynomial algebra $R[X, Y]$, since $f$ and $h^2$ are algebraically
independent over $R$ and $A$ is an integral domain. Thus $A$ would be a unique
factorization domain; but if $y = h^2$, $z = h(h^2 - \tfrac14)$, then

(1) $z^2 = y(y - \tfrac14)^2$.

We have $A = R[f, y, z]$ and one can see easily that the elements $z$, $y$, $y - \tfrac14$
are irreducible in $A$. Indeed, $A \subset R[f, h]$ and every factorization in $A$ would give a
factorization in $R[f, h]$, and these are all known for the elements $z$, $y$, $y - \tfrac14$. Then
(1) shows that $A$ is not a unique factorization domain, which is a contradiction.
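
The two facts the argument rests on, continuity of the generators at the integers and relation (1), can be spot-checked with exact rational arithmetic. This is a minimal sketch under stated assumptions, not part of the published solution: the function names h, y, z and jump_at, the sample points, and the step eps are all illustrative choices.

```python
from fractions import Fraction
from math import floor

QUARTER = Fraction(1, 4)

def h(x):
    # h = f - g - 1/2 with f(x) = x and g(x) = [x], the greatest integer function.
    return x - floor(x) - Fraction(1, 2)

def y(x):
    # Generator y = h^2.
    return h(x) ** 2

def z(x):
    # Generator z = h(h^2 - 1/4).
    return h(x) * (h(x) ** 2 - QUARTER)

def jump_at(func, n, eps):
    # Difference between the value at the integer n and the value just to its left.
    return abs(func(Fraction(n)) - func(Fraction(n) - eps))

eps = Fraction(1, 10**6)
samples = [Fraction(k, 7) for k in range(-20, 21)]

# Relation (1): z^2 = y(y - 1/4)^2, checked exactly at rational sample points.
relation_holds = all(z(x) ** 2 == y(x) * (y(x) - QUARTER) ** 2 for x in samples)

# h itself jumps by nearly 1 at each integer, while y and z change by less than eps.
jumps = {name: max(jump_at(fn, n, eps) for n in range(-3, 4))
         for name, fn in (("h", h), ("y", y), ("z", z))}
print(relation_holds, {name: float(v) for name, v in jumps.items()})
```

A finite check of this kind is only a sanity test of the formulas; the irreducibility of $z$, $y$ and $y - \tfrac14$ in $A$ still needs the argument given in the solution.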


Also solved by the proposer and I. Beck (Norway). 


INDEX TO VOLUME 80, 1973 
THE AMERICAN MATHEMATICAL MONTHLY 


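Author Index . . . . . . . . . . . . . . . 1187
Key Words and Phrases Index . . . . . . . 1190
Problems and Solutions Index . . . . . . . 1194
Reviews Index . . . . . . . . . . . . . . 1196
News and Notices Index . . . . . . . . . . 1217
MAA and its Sections Index . . . . . . . . 1218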


AUTHOR INDEX 


AICHELE DB Training secondary mathematics 
teachers in Venezuela 798-803 

ALAS OT On set points of discontinuity 186-187 

ALTER RONALD Can φ(n) properly divide n−1?
192-193 

APOSTOL TM Another elementary proof of
Euler’s formula for ζ(2n) 425-431

ASSOCIATION FOR WOMEN IN MATH Remarks on 
“Women in mathematics” 903-904 

AULT JC and WATTERS JF Circle groups of
nilpotent rings 48-52 

Award for Distinguished Service to Professor 
Raymond L Wilder 117-119 

Award of the 1973 Chauvenet Prize to Pro- 
fessor Carl D Olds 120 

BARKER GP Topological properties of the row
echelon form 787-789 

BARNES CW Remarks on the Bessel poly- 
nomials 1034-1040 

BENDER EA Teaching applicable mathematics 
302-307 

BENDER EA AND NEUWIRTH LP Traffic flow: 
Laplace transforms 417-423 

BILLINGSLEY PATRICK Prime numbers and 
Brownian motion 1099-1115 

BIRKHOFF GARRETT Current trends in algebra 
760-782 

BIRNBAUM S The main crises 545-546 

Boas RP and POLLARD H Continuous analogues 
of series 18-25 

BOLKER ED The spinor spanner 977-984

BRAWLEY JV and CARLITZ L A characterization
of the m × n matrices over a finite field
670-672

——— and ——— An addendum to the paper
“A characterization of the m × n matrices
over a finite field” 1041-1043


BROWNE JB See Eisenberg TA 

BRUCKNER AM The differentiability properties 
of typical functions in C [a, b] 679-683 

BURROW MD The minimal polynomial of a
linear transformation 1129-1131 

CALLAHAN FP An identity satisfied by derivations 
of a purely inseparable field 40-42 

CAMERON DE The mini-max property of the 
Tychonoff product topology 925-927 

CARLITZ L and SCOVILLE RICHARD The sign of 
the Bernoulli and Euler numbers 548-549 

CARLITZ L Inequalities for the area of two tri- 
angles 910-911 

CARLITZ L See Brawley JV

CARROLL JJ FISHER GA ODLYZKO AM SLOANE 
NJA What are the Latin square groups? 
1045 

CHAKERIAN GD and KLAMKIN MS Inequalities 
for sums of distances 1009-1017 

CHINN WG See Gilmer GF 

CLEVELAND RICHARD A global characterization 
of uniform continuity 64-66 

COHN PM Unique factorization domains 1-18

——— Correction to “Unique factorization
domains” 1115

COLLINS GE Computer algebra of polynomials 
and rational functions 725-755 

COOKE WP Geometric fit of a monotonic cubic
1047-1051 

CUPM Report to the Board of Governors 
August 1972 313-314 

DARST RB Simple proofs of two estimates for e
194 

Davis MARTIN Hilbert’s tenth problem is un- 
solvable 233-269 

DEAKIN MAB Developing countries: A rejoinder 
806 




DEAVOURS CA The quaternion calculus 995-1008

DE BOER DELMER See Williams HE

DERRICK WR A condition under which mapping
is a homeomorphism 554-555 

DONNELLY HAROLD On a problem concerning 
Euler’s phi-function 1029-1031 

DRIVER RD SASSER DW and SLATER ML The
equation x′(t) = ax(t) + bx(t − τ) with “small”
delay 990-995

DROBOT VLADIMIR On sums of powers of a
number 42-44 

EGGAN LC and INSEL AJ A Wronskian condition
related to ordinary differential equations
300-302 

EIDSWICK JA A crowded set of nonintersecting
lines 415 

EISENBERG TA and BROWNE JB Using student-tutors
in precalculus instruction 685-688

ELLIS DF Economics as a minor for undergraduate
mathematics majors 688-689

ERDÖS P and GUY RK Crossing number problems
52-58

FADELL AG A proof of the chain rule for derivatives
in N-space 1134-1135

FISHER B On a problem of Besicovitch 785-787

FISHER GA See Carroll JJ

FLANDERS HARLEY Differentiation under the 
integral sign 615-627 

——— Number fifty-two 1099

FLANIGAN FJ Some half-plane Dirichlet prob- 
lems: A bare hands approach 59-61 

FREILICH GERALD Increasing continuous singular 
functions 918-919 

FUSARO BA The area of a hypersphere in Riemannian
space 179-184

GALE DAVID On the theory of interest 853-868 

GILMER R and ROSELLE D Complements and 
comments 1116-1118 

GILMER GF SINER HB MANSFIELD R and CHINN 
WG Concerns of two year colleges 1055-1057 

GOLDSTEIN LJ A history of the prime number 
theorem 599-615 

——— Correction to “A history of the prime
number theorem” 1115

GORDON WB Addendum to “On the diffeomorphisms
of Euclidean space” 674-675

GRAUDONS NANCY See Parberry EA 

GREENSPAN HP Applied mathematics at M. I. T. 
67-72 




GREENSPAN DONALD A finite difference proof 
that E = mc? 289-292 

GREITZER S The first USA Mathematical
Olympiad 276-281

GUSTAFSON WH What is the probability that 
two group elements commute? 1031-1034 

Guy RICHARD Monthly Research Problems 
1969-1973 1120-1128 

GUY RK See Erdös P

Guy RK and SELFRIDGE JL The nesting and 
roosting habits of the laddered parenthesis 
868-876 

HAHN LIANG-SHIN On an extension of the theorem
of Hausdorff-Young 667-669

HALMOS PR The legend of John von Neumann
382-394 

Ham MW The lecture method in mathematics: 
A student’s view 195-201 

HANNA AZMI On injective modules 297-298

HEIMER RT See Jansson LC 

HEINEN JA and WILANSKY ALBERT A theorem 
on set inclusion in metric spaces 46-48 

HENRY BR Solution of Fejes Tóth’s Illumination
Problem 409-410 

HERSH REUBEN How to classify differential polynomials
641-654

HIGGINS JJ Representing a finite Borel measure
in terms of its distribution function
683-685 

HINDMAN NEIL Basically bounded sets and a 
generalized Heine-Borel theorem 549-552 

HIRSCHHORN MD How unexpected is the prime 
number theorem? 675-677 

HORADAM AF See Shannon AG

HUGHES BB Survival for mathematics students
689-690

INSEL AJ See Eggan LC

JACOB HG Another proof of the rational decomposition
theorem 1131-1134

JAMESON GJO A problem on series 1119 

JANSSON LC and HEIMER RT On behavioral
objectives in mathematics education 930-933 

JENNER WE On non-associative algebras derived 
from graphs 288-289 

JONES JP and TOPOROWSKI S Irrational numbers
423-424 

KANTER MAREK Stable laws and the imbedding
of L^p spaces 403-407

KAZARINOFF ND and WEITZENKAMP ROGER 
Squaring rectangles and squares 877-888 




KEMENY JG What every college president should 
know about mathematics 889-901 

KIEFFER JC A covering theorem 410-411 

KILPATRICK J See Polya G 

KIMBERLING CH Two-dimensional complete
monotonicity with diagonalization 789-791 

KLAMKIN MS See Chakerian GD 

KLARNER DA and RADO R Linear combinations
of sets of consecutive integers 985-989 

LACEY HE The Hamel dimension of any infinite
dimensional separable Banach space is c 298 

LANGFORD ERIC Distributivity over the Dirichlet
product and completely multiplicative arithmetical
functions 411-414

LARNEY VIOLET H Female mathematicians, 
where are you? 310-313 

LEIBOWITZ GERALD The Cesaro operators and 
their generalizations: Examples in infinite- 
dimensional linear analysis 654-661 

LEONARD JL A discovery course in graph
theory 1052-1053 

LINDSTROM PA Survival of the two-year college 
mathematics teacher 1135-1137 

LUXEMBURG WAJ What is nonstandard analysis? 
part II 38-67 

MANSFIELD R See Gilmer GF 

McKay JH The William Lowell Putnam Mathe- 
matical Competition 170-179, 1017-1028 

McSHANE EJ A unified theory of integration 
349-359 

——— The Lagrange multiplier rule 922-925

MERRIS RUSSELL The permanent of a doubly 
stochastic matrix 791-793 

MEYER WALTER Equitable coloring 920-922 

MILLMAN RS and STEHNEY ANN K. The geometry 
of connections 475-500 

MINASSIAN DP Types of fully ordered groups 
159-169 

MINSKER STEVEN A familiar combinatorial
identity proved by complex analysis 1051 

MONNA AF Experiences with lectures on the
history of mathematics in Utrecht 803-806 

Moore MH A convex matrix function 408-409 

MORDELL LJ The sign of the Bernoulli numbers
547-548 

Moser LEO Some mathematical verses 902 

NADLER SB The indecomposability of the 
dyadic solenoid 677-679 

NASH BO Reachability problems in vector
addition systems 292-295 




NEUWIRTH LP See Bender EA 

NICKEL PA Single layer potentials and the
Cauchy-Kowalewski theorem 61-64 

ODLYZKO AM On lattice points inside convex
bodies 915-918

——— See Carroll JJ

O’HARA PJ Another proof of Bernstein’s theorem
673-674 

PAPADIMITRIOU IOANNIS A simple proof of
the formula $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$ 424-425

PAPANICOLAOU GC Stochastic equations and 
their applications 526-545 

PARBERRY EA and GRAUDONS NANCY When 
do all k-sequences modulo m have period 
one? 295-297 

PARKINSON CLAIRE Ambivalence in alternating 
symmetric groups 190-192 

PIZER AK A problem on rational functions
552-553 

POLLARD H See Boas RP 

PÓLYA G A letter by Professor Pólya 73-74

PÓLYA G and KILPATRICK J The Stanford
University competitive examination in mathematics
627-640

PUTNAM HILARY Recursive functions and
hierarchies, part II 68-86

RADO R See Klarner DA

RAMALEY WC Independent study for under- 
graduates 555-558 

RANDLES RH and SCHAEFFER AJ An integrated 
sequence in the mathematical sciences for 
undergraduate business students 431-433 

RECAMAN S BERNARDO Questions on a sequence
of Ulam 919-920 

RIORDAN JOHN A note on Catalan parentheses 
904-906 

ROBINSON ABRAHAM Function theory on some 
nonarchimedian fields, part II 87-109 

ROSELLE D See Gilmer R 

ROSMAN BH Another approach to the cubic
interpolating spline 927-930 

RUCHTE MF and RYDEN RW A proof of
uniqueness of factorization in the Gaussian
integers 58-59

RUDIN WALTER A generalization of a theorem 
of Archimedes 794-796 

RYDEN RW See Ruchte MF 

SAMUELSSON AKE A local mean value theorem 
for analytic functions 45-46 

SASSER DW See Driver RD 




SCHAEFFER AJ See Randles RH 

SCHOENBERG IJ The elementary cases of Landau’s 
problem of inequalities between derivatives 
121-158 

SCOTT EJ Determination of the Riemann
function 906-909 

SCOVILLE RICHARD See Carlitz L 

SELFRIDGE JL See Guy RK 

SHANNON AG and HORADAM AF Generalized
Fibonacci number triples 187-190 

SHAPIRO HN A micronote on a functional 
equation 1041 

SHISHA O On the discrete version of Wirtinger’s 
inequality 755-760 

SIELAFF RW Perfect parallelograms 414-415

SILVER JERRY and WAITS BERT Multiple-choice
examinations in mathematics not valid 
for everyone 937-942 

Simple groups 1028 

SINER HB See Gilmer GF 

SLATER ML See Driver RD 

SLOANE NJA See Carroll JJ 

SPENCER JOEL A deception game 416-417 

STANFORD DP Functions satisfying a mean 
value property at their zeros 665-667 

STEEN LA Highlights in the history of spectral 
theory 359-381 

STEHNEY ANN K_ See Millman RS 

SWENSON JR The chromatic polynomial of a 
complete bipartite graph 797-798 

TAYLOR PD A Banach space characterization 
of the space of affine continuous functions 
on a compact convex set 911-915 

TOPOROWSKI S See Jones JP 

TÓTH FEJES L Exploring a planet 1043-1044

TULL JP A discovery approach to e 193-194


ULLMAN JL An area theorem for Schlicht functions
184-186

VAN VLECK FS A remark concerning absolutely
continuous functions 286-287

VAUGHT RL Some aspects of the theory of
models, part II 3-37

WAITS BERT Individualized instruction in large
enrollment mathematics courses 307-310

——— See Silver Jerry

WALSH JL History of the Riemann mapping
theorem 270-276

WALTER JOHANN On elementary proofs of
Peano’s existence theorems 282-286

WATTERS JF See Ault JC

WEGNER BERND Existence of four concurrent
normals to a smooth closed hypersurface
of E^n 782-785

WEITZENKAMP ROGER See Kazarinoff ND

WILANSKY ALBERT See Heinen JA

WILLCOX AB England was lost on the playing
fields of Eton: A parable for mathematics
25-40

WILLIAMS HE and DE BOER DELMER Teaching a
computer-oriented laboratory course for ordinary
differential equation 933-937

WILLIAMS RK A note on conformality 299-300

WILLMORE T Correction to “The Math Societies
and Associations in the U.K.” 876

WILSON RJ An introduction to matroid theory
500-525

WILSON RL A bow to relevancy 1053-1055

WYMAN BF Correction to: What is a reciprocity
law? 281

ZAHN CT Alternating Euler paths for packings
and covers 395-403

ZELINSKY D A. A. Albert 661-665


KEY WORDS AND PHRASES INDEX


Absolutely continuous functions VAN VLECK
FS 286

Additive functions SHAPIRO HN 1041

Albert AA ZELINSKY D 661

Algebra BIRKHOFF G 760

Algorithms COLLINS GE 725

Alternating groups PARKINSON C 190

Analytic function WILLIAMS RK 299

Applicable mathematics BENDER EA 302

Applications WILSON RL 1053


Applied mathematics education GREENSPAN 
HP 67 

Approximation by power sums DROBOT V 42

Area theorem ULLMAN JL 184 


Awards 117 120 


Banach space HEINEN JA & WILANSKY A 
46 

Behavioral objectives JANSSON LC & HEIMER 
RT 930 




Bernoulli numbers APOSTOL TM 425 MORDELL
LJ 547 CARLITZ L & SCOVILLE R 548

Bernstein’s theorem O’HARA PJ 673 

Bessel polynomials BARNES CW 1034 

Bipartite graph SWENSON JR 797 

Borel measure HIGGINS JJ 683

Brownian motion BILLINGSLEY P 1099 

Business students RANDLES RH & SCHAEFFER AJ 
431 


Calculus course BENDER EA 302 

Catalan numbers Guy RK & SELFRIDGE JL 
868 RIORDAN J 904 

Cauchy problem SCOTT EJ 906

Cesaro operators LEIBOWITZ G 654

Chain rule FADELL AG 1134

Characteristic polynomial BURROW MD 1129

Chromatic number MEYER W 920

Circle group AULT JC & WATTERS JF 48

College president KEMENY JG 889 

College teaching HUGHES BB 689 

Coloring MEYER W 920 

Combinatorial identity MINSKER S 1051 

Committee on two-year colleges GILMER GF 
SINER HB MANSFIELD R & CHINN WG 
1055 

Commuting elements group GUSTAFSON WH 
1031 

Compactness theorem VAUGHT RL June-July 
part II 3 

Competition Putnam McKay JH 170 1017 

Complements and comments GILMER R & 
ROSELLE Davip 1116 

Completely monotone matrices KIMBERLING
CH 789

Computer algebra COLLINS GE 725

Computer differential equations WILLIAMS HE
& DE BOER D 933

Concurrent normals WEGNER B 782 

Conformal mapping WALSH JL 270 WILLIAMS
RK 299

Connections MILLMAN RS & STEHNEY AK 475 

Consecutive integers KLARNER DA & RADO R 
985 

Convex sets TAYLOR PD 911 FEJES TÓTH L
1043

Covering theorem KIEFFER JC 410 

Crises in mathematics BIRNBAUM S 545 

Crossing numbers ERDÖS P & GUY RK 52

CUPM 313 




Decay equation DRIVER RD SASSER DW &
SLATER ML 990

Deception game SPENCER J 416 

Decision making KEMENY JG 889 

Derivations CALLAHAN FP 40 

Developing countries DEAKIN MAB 806 

Diffeomorphisms GORDON WB 674 

Differentiability BRUCKNER AM 679 

Differential-difference equation DRIVER RD
SASSER DW & SLATER ML 990

Differential geometry MILLMAN RS & STEHNEY 
AK 475 

Differential polynomials HERSH R 641 

Differentiation under the integral sign FLANDERS 
H 615 

Diophantine equation Davis M 233 

Dirichlet problems FLANIGAN FJ 59 

Discovery course LEONARD JL 1052 

Distribution function COOKE WP 1047

Doubly stochastic matrix MERRIS R 791 


e TULL JP 193 Darst RB 194 

Economics GALE D 853 

Economics minor ELLIS DF 688

Electric circuits KAZARINOFF ND & WEITZEN- 
KAMP R 877 

Energy and mass GREENSPAN D 289 

Euler function DONNELLY H 1029 

Euler numbers CARLITZ L & SCOVILLE R 548

Euler paths ZAHN CT 395

Euler’s formula PAPADIMITRIOU I 424 APOSTOL
TM 425

Euler totient ALTER R 192 


Fibonacci triples SHANNON AG & HORADAM
AF 187


Gaussian integers RUCHTE MF & RYDEN RW 58 

Geometric inequalities CHAKERIAN GD & 
KLAMKIN MS 1009 

Graphs JENNER WE 288 

Graphs in the plane ERDÖS P & GUY RK 52

Graph theory WILSON RJ 500 LEONARD JL 
1052 

Group, commuting elements GUSTAFSON WH 
1031 


Hamel dimension LACEY HE 298
Harmonic space FUSARO BA 179
Hausdorff-Young theorem HAHN LS 667 




Heine-Borel theorem HINDMAN N 549 
Hierarchies PUTNAM H June-July part II 68 
Hilbert space STEEN LA 359 

Hilbert’s 10th problem Davis M 233 

History of algebra BIRKHOFF G 760 

History of mathematics MONNA AF 803 
Homeomorphism DERRICK WR 554 
Hyperbolic differential equation SCOTT EJ 906
Hypersphere FUSARO BA 179


Illumination problem HENRY BR 409 

Indecomposable continua NADLER SB 677 

Independent study RAMALEY WC 555 

Individual instruction WAITS B 307

Inequalities CHAKERIAN GD & KLAMKIN MS 
1009 

Inequalities between derivatives SCHOENBERG IJ 
121 

Infinite integrals BOAS RP & POLLARD H 18

Infinite series BOAS RP & POLLARD H 18
JAMESON GJO 1119

Infinitesimals LUXEMBURG WAJ June-July part II 
38 

Injective module HANNA A 297 

Integration MCSHANE EJ 349 

Interest GALE D 853 

Irrational numbers JONES JP & TOPOROWSKI S
423


Kakeya problem FISHER B 785 
Knots BOLKER ED 977


Lagrange multipliers MCSHANE EJ 922 

Landau’s problem SCHOENBERG IJ 121 

Laplace transforms BENDER EA & NEUWIRTH
LP 417 SCOTT EJ 906

Large courses WAITS B 307

Latin squares CARROLL JJ FISHER GA ODLYZKO
AM SLOANE N 1045

Lattice points ODLYZKO AM 915

Lebesgue integral McSHANE EJ 349 

Lecture method Ham MW 195 

Leibnitz rule FLANDERS H 615 

Linear connections MILLMAN RS & STEHNEY 
AK 475 

Lowenheim-Skolem theorem VAUGHT RL June- 
July part II 3 

L^p spaces KANTER M 403


Markov processes PAPANICOLAOU GC 526 
Mathematics Association WILLMORE T 876 




Mathematics contest POLYA G & KILPATRICK J 
627 

Matrices over a finite field BRAWLEY JV &
CARLITZ L 670 1041

Matrix function MOORE MH 408

Matroid WILSON RJ 500 

Mean value property STANFORD DP 665 

Mean value theorem SAMUELSSON A 45 

Metric spaces HEINEN JA & WILANSKY A 46 

Minimal polynomial BURROW MD 1129 

Models VAUGHT RL June-July part II 3 KEMENY 
JG 889 

Moser Leo 902 

Multiple choice examinations SILVER J & Waits 
B 937 

Multiple functions LANGFORD E 411


Nilpotent ring AULT JC & WATTERS JF 48

Non-archimedean fields ROBINSON A June-July
part II 87

Non-intersecting lines EIDSWICK JA 415

Non-standard analysis ROBINSON A 87 VAUGHT 
RL June-July part II 3 

Normals of a hypersurface WEGNER B 782 

Numerical algebra BIRKHOFF G 760


Olympiad GREITZER SL 276 
Ordered fields ROBINSON A June-July part II 87 
Ordered groups MINASSIAN DP 159 


Parable WILLCOX AB 25

Parentheses Guy RK & SELFRIDGE JL 868 
RIORDAN J 904 

Partially ordered groups MINASSIAN DP 159 

Peano’s existence theorem WALTER J 282 

Perfect parallelograms SIELAFF RW 414 

Permanent MERRIS R 791 

Plank problem FEJES TÓTH L 1043

Polynomials COLLINS GE 725

Precalculus instruction EISENBERG TA & BROWNE 
JB 685 

Prime number BILLINGSLEY P 1099 

Prime number theorem GOLDSTEIN LJ 599
HIRSCHHORN MD 675

Problem solving PÓLYA G 73

Product topology CAMERON D 925

Promotion PÓLYA G 73

Putnam competition McKAY JH 170 1017


Quaternion calculus DEAVOURS CA 995




Rational canonical form JACOB HG 1131

Rational functions PIZER AK 552 

Reciprocity law WYMAN BF 281 

Recursive functions PUTNAM H_ June-July 
part II 68 

Relevance WILLCOX AB 25

Research problems GUY RK 1120

Riemann mapping theorem WALSH JL 270

Row echelon form BARKER GP 787 


Schlicht functions ULLMAN JL 184 

Sequences RECAMAN B 919 

Sequences of integers PARBERRY EA & GRAU- 
DONS N 295 

Set of discontinuity ALAs OT 186 

Simple groups 1028 

Single layers NICKEL PA 61

Singular functions FREILICH G 918 

Space volume RUDIN W 794 

Spectral theory STEEN LA 359 

Spinor BoLKER ED 977 

Splines SCHOENBERG IJ 121 ROSMAN BH 927

Squaring rectangles KAZARINOFF ND &
WEITZENKAMP R 877

Stanford competition PÓLYA G & KILPATRICK
J 627

Stochastic equations PAPANICOLAOU GC 526 

Stochastic integral KANTER M 403 

Student tutors EISENBERG TA & BROWNE JB 
685 




Summability theory LEIBOWITZ G 654

Summation of n^{-2} PAPADIMITRIOU I 424
Survival LINDSTROM PA 1135 


Teacher training AICHELE DB 798 

Teaching HUGHES BB 689

Totient DONNELLY H 1029

Traffic flow BENDER EA & NEUWIRTH LP 417

Triangles CARLITZ L 910

Turing machines PUTNAM H June-July part II
68

Two-year colleges GILMER GF SINER HB
MANSFIELD R & CHINN WG 1055

Two-year college teacher LINDSTROM PA 1135 


Uniform continuity CLEVELAND R 64 

Unique factorization RUCHTE MF & RYDEN
RW 58

Unique factorization domain COHN PM 1 1115


Vector addition system NASH BO 292 
Venezuela AICHELE DB 798 
Von Neumann John HALMOS PR 382


Wirtinger’s inequality SHISHA O 755

Women in mathematics LARNEY VH 310 ASSOC.
FOR WOMEN IN MATH. 903

Wronskian EGGAN LC & INSEL AJ 300


Zeta function APOSTOL TM 425




PROBLEMS AND SOLUTIONS 


PROBLEMS PROPOSED 


Alexander JC 440 Grosch CB 435 O’Brien George 943 

Al Salam WA 315 Hahn L.-S 943 O’Farrell AG 814 
Andrushkiw JW 1067 Harris LA 697 Pomerance Carl 949 
Apostol TM 1058 Hemperly JC 1058 Recaman Bernardo 434 1057 
Barnes FW 434 Heuer GA 325 Redheffer Ray 1138 
Battany DM 565 Hoshek Lyles 691 Reese Sylvester 209 
Bernhart Frank 208, 324 Howard FT 559 Reingold EM 691 

Bloom DM 1147 Hsieh SC 325 Renz PL 949 

Boas RP 814 Hubbard JH 324 Ringel CM 82 

Boyd AV 434 Hwang FK 1058 Ruderman HD 82 807 1139 
Brons KA 202 Ivanoff VF 203 Schaumberger Norman 316 
Buck RC 691 1067 Jackson DE 1058 Schurle AW 209 

Buckley JJ 949 Johnson CR 814 Selucky K 1067 

Chen SY 325 Johnson Wells 943 Shafer RE 316 

Cohn Paul 697, 814 Just Erwin 76 316 Singmaster David 75 
Cooper CDH 559 Kamp JF 83 Sivaramakrishnan R 1147 
Dashiell FK 83 Kirk RB 564 Siwiec Frank 1139 

Daykin DE 202 564 Klamkin MS 75 75 807 Smythe RT 943 

Deutsch Emeric 814 Knight Bill 564 Smyth CJ 949 

Dixon ED 324 Kuzam FA 82 Spencer Joel 209 

Dlab V 82 Laugwitz Detlef 1067 Stanley Richard 949 

Dodge CW 202 Letac Gerard 440 441 1139 Stern Frederick 949 

Doyle JK 82 Leuenberger F 1138 Stewart BM 691 

Dugdale JK 564 Linderholm CE 564 Styer David 1067 

Eakin PM 441 Long CA 807 Tomescu Ioan 559 
Ehrenfeucht Andrzej 697 Masley John 808 Umberger Edmund 1058 
Entringer RC 1058 Mauldon JG 697 Uppuluri VRR 74 

Gentile ER 324 Maurer Russell 316 Walker AW 202 316 560 
Girod Donald 814 McConnell Alan 808 Wallen LJ 1147 

Glasser ML 440 McLean David 943 Wang ETH 691 692 808 943 1139 
Goldberg Michael 434 Minassian Donald 1146 Washington LC 697 
Golomb SW 697 Murray PJ 692 Wendel JG 325 559 

Good IJ 209 Mycielski Jan 1146 Wilansky Albert 325 564 1067 
Gould HW 75 Myhill John 83 Wolk Barry 434 

Greitzer SL 75 Newman David 203 

Groenewoud Cornelius 1058 Nicol CA 560 692 


PROBLEMS SOLVED 


Andersen KF 1150 Carty Frederick 317 813 Gardner CS 1139 
Annulis JT 1071 Chakerian GD 562 Garfield Ralph 321 
Atkinson DW 1144 Charnow Allen 1061 Gerber Leon 1145 
Bauman Norman 561 Chouteau Charles 693 Gerst Irving 214 
Belanger DG 567 1068 Comiskey John 320 Gibbs Richard 1066 
Bennett Coll. Team 1060 Converse GA 562 Gilmer Robert 944 
Bern Switzerland Problem Solving Coolidge John 947 Goldberg Michael 692 
Group 319 Coppersmith Don 815 Greening MG 436 439 694 695 1143 
Bernau SJ 87 D’Alarcao H 320 Grimm CA 80 
Bloom DM 206 320 563 1142 Davies RO 87 Grossman JW 210
Borwein David 698 DeMeyer Frank 83 Gudder SP 566 
Breiteig Trygve 696 Dickson RJ 435 438 Guggenheimer H 211 
Breusch Robert 809 Djokovic DZ 1151 Hertz Ellen 811 951 
Buck RC 1144 Evans RJ 560 Heuer GA 952 
Burke PJ 207 Felsinger Neal 327 Hobbs AM 950 
Buschman RG 213 Ferrero Bruce 1069 Huddleston Nancy 700 
Butler JG 443 Fine NJ 435 Israel RB 329 


Carlitz Leonard 441 819 Galvin Fred 950 Jagers AA 85 815 1150 




Johnson Wells 207 
Jondrup Soren 212 
Kappus Hans 810 
Kestelman H 1062 
Klamkin MS 323 1144 
Klasi ML 80 

Klein EM 326 
Kuenzi NJ 439 
Langford ES 1063 
Lass Harry 84 1070 
Leibowitz GM 566 568 
Leitch RD 1149 
Linus V 1144 

Little JG 1071 
Lossers OP 818 1064 
Makowski Andrzej 78 
Mattics LE 1139 
Mauldon JG 79 81 946 
Meir Amram 444 
Mitra SS 206 

Monk David 317 
Montgomery PL 446 




Moore T 320 
Moser WOJ 437 

Niven Ivan 812 
Pierce Stephen 443 
Prielipp Bob 439 
Prostanstus LP 438 
Quackenbush RW 950 
Robinson GB 77 203 
Rose-Hulman Sol Group 1143 
Rosenthal Eric 445 
Rousseau CC 328 700 
Ruehr OG 1071 

St Olaf College Students 436 945 1146 
Schmitt FG 701 
Schuurmann Fred 817 
Scoville Richard 79 441 
Segal AC 83 

Smith JR 1068 

Snow Wolfe 565 
Spear David 86 
Spindler Stephen 76 
Starke EP 77 


SOLUTIONS 




Stein Alan 1059 
Stenger Allen 1148 
Stocker Harold 1066 
Stoll Manfred 698 
Tang Hwa 1061 
Taylor Herbert 695 
Taylor RL 698 
Temple Univ Probl Solv Group 948 
Thomas Gomer 818 
Thomas Martin 808 
Torchinelli Guy 1064 
Trost EW 77 

Vitale Mike 946 
Walker AW 203 204 205 
Wall CR 696 

Wetzel JE 562 
Wilkins JE Jr 1061 
Wilson Norman 1062 
Witsenhausen HS 318 
Wong William 442 
Editorial Note 561 


Numbers in boldface type refer to problems, those in lightface, to pages 


E-1085 808   E-2245 809   E-2293 317   E-2294 692   5814 83    5815 83    5816 84    5817 85
E-2330 76    E-2332 77    E-2333 78    5818 86    5820 87    5821 210
E-2334 79    E-2335 80    E-2336 81    5822 211   5823 212   5824 213
E-2337 203   E-2338 204   E-2339 205   5825 214   5826 326   5827 326*
E-2340 206   E-2341 206   E-2342 207   5828 327   5829 327   5830 328
E-2343 317   E-2345 318   E-2346 319   5831 329   5832 441   5833 442
E-2347 321   E-2348 323   E-2350 435   5834 443   5835 443   5836 444
E-2351 436   E-2352 436   E-2353 437   5837 445   5838 445   5839 446
E-2354 439   E-2355 560   E-2356 810   5840 565   5841 566   5842 566
E-2357 439   E-2358 561   E-2359 561   5843 567   5844 568   5845 698
E-2360 562   E-2361 693   E-2362 810   5846 698   5847 699   5848 700
E-2363 694   E-2364 563   E-2365 695   5849 701   5850 815   5851 815
E-2366 695   E-2367 696   E-2368 812   5852 817   5853 818   5854 819
E-2369 813   E-2370 944   E-2371 945   5855 950   5856 950   5857 951
E-2372 946   E-2373 1139  E-2374 946   5858 952   5859 1068  5862 1068
E-2375 947   E-2376 948   E-2378 1059  5863 1069  5864 1070  5865 1147
E-2379 1060  E-2380 1061  E-2381 1062  5867 1148  5868 1070  5869 1071
E-2382 1064  E-2383 1066  E-2386 1141  5870 1150  5871 1150  5873 1151
E-2388 1142  E-2391 1143  E-2393 1145
E-2395 1146


* Editorial note 326 




REVIEWS 


Names of authors are in ordinary type, those of reviewers in capitals. 


Apostol TM Selected Papers on Calculus 
DOROTHY K BERNSTEIN 93-94

Bechtell Homer The Theory of Groups RE 
PHILLIPS 447 

Beck Anatole Bleicher MN and Crowe DW 
Excursions into Mathematics JAMES STASHEFF 
821 

Behrens Ernest-August Ring Theory N DIVINSKY 
95 

Blakeslee DW and Chinn WG Introductory 
Statistics and Probability: A Basis for 
Decision Making DS MOORE 214

Bleicher MN See Beck Anatole 

Blumenthal LM and Menger Karl Studies in 
Geometry A BRUEN 330 

Brauer Fred Nohel JA Schneider Hans Linear 
Mathematics: An Introduction to Linear 
Algebra and Linear Differential Equations 
WS LOUD 451

Campbell HG Linear Algebra with Applications: 
Including Linear Programming DE CHRISTIE 
702 

Chinn WG See Blakeslee DW 

Chover Joshua The Green Book of Calculus 
AD KRAMER 955-956 

Cohn PM Free Rings and their Relations DJ 
FIELDHOUSE 573 

Crowe DW See Beck Anatole 

Davenport WB Jr Probability and Random 
Processes JL SNELL 88-90 

Durbin JR Mathematics: Its Spirit and Evolution 
RANDALL LONGCORE 1074-1075 

Dwass Meyer Probability Theory and Applications 
JL SNELL 88-90 

Edwards AL Probability and Statistics DS
MOORE 215


Embry Mary Schell JF Thomas JP Calculus 
and Linear Algebra: an Integrated Approach 
RT Hoop 454 

Ericksen GL Scientific Inquiry in the Behavioral 
Sciences: an Introduction to Statistics 
DS MOORE 215

Finkbeiner DT Elements of Linear Algebra 
DE CHRISTIE 702

Folks Leroy See Kempthorne Oscar 

Gillett Philip Linear Mathematics WS LOUD 451


Goodman AW Ratti JS Finite Mathematics 
with Applications ELIZABETH BERMAN 91-92 
TW CASSTEVENS 91 

Gratzer George Lattice Theory: First Concepts 
and Distributive Lattices DT FINKBEINER 824 

Gray Mary W Calculus with Finite Mathematics 
for Social Sciences GERALD GIACCAI and 
KENNETH SLONNEGER 1076-1077 

Herstein IN and Sandler R Introduction to the 
Calculus JV LEWIS 90-91

Hilton PJ and Stammbach U A Course in Homolo- 
gical Algebra CLAUDE SCHOCHET 1153-1154 

Kaplansky Irving Set Theory and Metric Spaces 
WR PARK 953-955

Kempthorne Oscar and Folks Leroy Probability, 
Statistics and Data Analysis J KIEFER 822

Kochendorffer Rudolf Group Theory RE 
PHILLIPS 447 

Kreyszig Erwin Introductory Mathematical Statistics
BP KORIN 454

Larsen MD McCarthy PJ Multiplicative Theory 
of Ideals HH BRUNGS 94

LeLionnais F (ed.) Great Currents of Mathematical
Thought KO MAY and SB REGOCZEI 825

Levine Arnold Theory of Probability JL SNELL
88-90

Long PE An Introduction to General Topology 
HELEN E SALZBERG 1077 

Lukacs Eugene Probability and Mathematical 
Statistics: an Introduction DS MOORE 214

Maunder CRF Algebraic Topology MJ POWERS
449 

McCarthy PJ See Larsen MD 

Mendenhall W Introduction to Probability and 
Statistics, Third Edition DS MOORE 215

Menger Karl See Blumenthal LM 

Meyer PL Introductory Probability and Statistical 
Applications 2nd edition RF BARNES 1075 

Mihalek RJ Projective Geometry and Algebraic 
Structures BURNETT MEYER 1072-1074 

Noether GE Introduction to Statistics: a Fresh 
Approach EL DOLNEY 335 

Nohel JA See Brauer Fred 

O’Nan Michael Linear Algebra DE CHRISTIE 702

Penney DE Perspectives in Mathematics ET 
ORDMAN 568 




Ratti JS See Goodman AW 

Readings for Mathematics: a Humanistic Ap- 
proach ET ORDMAN 569 

Reed Michael and Simon Barry Methods of 
Mathematical Physics, V. I.: Functional 
Analysis JA GOLDSTEIN 1152-1153 

Reiner Irving Introduction to Matrix Theory 
and Linear Algebra DE CHRISTIE 702

Resnikoff HL and Wells RO Mathematics in 
Civilization ET ORDMAN 568 

Roberts AW Introductory Calculus with Analytic 
Geometry and Linear Algebra 2nd ed. 
BURNETT MEYER 956 

Rogers Andrei Matrix Methods in Urban and 
Regional Analysis DE CHRISTIE 702 

Sandler R See Herstein IN

Schell JF See Embry Mary 

Schneider Hans See Brauer Fred 

Semple JG See Tyrrell JA 

Simon Barry See Reed Michael 

Smirnov VI Linear Algebra and Group Theory 
DE CHRISTIE 702




Snapper Ernst and Troyer RJ Metric Affine 
Geometry A BRUEN 330 A ADELBERG 333 

Spector Lawrence Liberal Arts Mathematics 
ET ORDMAN 569 

Spivak Michael A Comprehensive Introduction 
to Differential Geometry AD KRAMER 448 

Stammbach U See Hilton PJ 

Thomas JP See Embry Mary 

Troyer RJ See Snapper Ernst 

Tyrrell JA and Semple JG Generalized Clifford 
Parallelism A BRUEN 330 

Ward LE Topology: An Outline for a First 
Course ANN K. STEHNEY 823 

Wells RO See Resnikoff HL 

Wimbish GJ Mathematics: A Humanistic Ap- 
proach ET ORDMAN 569 

Wylie CR Jr Introduction to Projective Geometry 
BURNETT MEYER 1072-1074 

Young DM Iterative Solution of Large Linear 
Systems LA HAGEMAN 92-93 

Editorials 821 

Editorial Notice 475 



INDEX TO VOLUME 80, 1973 


[December 


TELEGRAPHIC REVIEWS 


Aaker David A (Ed) Multivariate Analysis
in Marketing Theory and Application
583 

Aarhus U Papers from the Open House for 
Probabilists 228 

Aarhus U Papers from the Open House for 
Functional Analysts 461 

Achenbach JD Contributions to the Theory
of Aircraft Structures 583 

Adams JF Algebraic Topology--A Student's
Guide 107 

Adamson Iain T Rings Modules and Alge- 
bras 713 

Afifi AA Azen SP Statistical Analysis 
A Computer Oriented Approach 462 

Akivis MA Goldberg VV Introductory Linear
Algebra 102 

Alavi Y Lick DR White AT (Ed) Lecture 
Notes in Mathematics-303 711 

Albert Arthur Regression and the Moore-Penrose
Pseudoinverse 463 

Alder Henry L Roessler EB Introduction 
to Probability and Statistics Fifth 
Edition 344 

Allen Jr Richard C See Shampine LF 

Altwerger Samuel I Modern Mathematics 
An Introduction 1078 

AMS Transactions of the Moscow Mathematical
Society for the Year 1969 
V 20 97 

AMS Transactions of the Moscow Mathematical
Society for the Year 1970 
V 21 97 

AMS Transactions of the Moscow Mathematical
Society for the Year 1970 
V 22 97 

Ambrose William G Trigonometry A Functional
Approach 829 

Amstadter Bertram L Reliability Mathematics
Fundamentals Practices Procedures
109 

Anderson Dan See Taylor JG 

Anderson TW Gupta SD Styan GPH A Bibliography
of Multivariate Statistical
Analysis 967 

Andree David D See Andree RV 

Andree Josephine P See Andree RV 

Andree Richard V Andree JF Andree DD 
Computer Programming Techniques Analysis
and Mathematics 718 

Anger Arthur L Computer Science The 
PL/1 Language 229 

Ansorge R Tornig W Lecture Notes in 
Mathematics-267 105 

Anton Howard Elementary Linear Alge- 
bra 576 


Anton Howard Elementary Linear Algebra 
338 

Applebaugh Gwendolyn Neul See Taylor JG 

Arndt Ole Jensen FV An Introduction to 
a Discussion on Dialectic Materialism
and Mathematics 222 

Arnold William R See Schminke CW 

Artin Michael Théorèmes de Représentabilité
Pour Les Espaces Algébriques
576 

Artin M Grothendieck A Verdier JL Lecture
Notes in Mathematics-269 224 

Lecture Notes in Mathematics-270 459 

Lecture Notes in Mathematics-305 1081 

Artmann Benno Eine Einführung in die 
Algebra 960 

Ashley John P Harvey ER Modern Geome- 
try 958 

Ashour S Lecture Notes in Economics and 
Mathematical Systems-69 461 

Athreya KB Ney PE Branching Processes 
1086 

Aubuchon III William E See Moran Jr MM 

Aucoin Clayton V See Ohmer MM 

Auslander Louis Mathematics Through 
Statistics 574 

Averill EW Elements of Statistics 229 

Avila Geraldo SS Lectures on the Wave 
Equation 340 

Azen SP See Afifi AA 

Aziz AK (Ed) The Mathematical Foundations
of the Finite Element Method 
with Applications to Partial Differential
Equations 341 

Babakhanian Ararat Cohomological Methods
in Group Theory 337 

Babich VM (Ed) Mathematical Problems in 
Wave Propagation Theory Part III 113 

Bachman George See Narici L 

Bagchi TP Templeton JGC Lecture Notes 
in Economics and Mathematical Systems 
72 1085 

Bailey Daniel E Probability and Statistics
Models for Research 462 

Bailey Jr Walter L Introductory Lectures 
on Automorphic Forms 962 

Bajpai AC Fortran and Algol A Programmed 
Course for Students and Technology 837 

Bajpai OP Foundations of Statistics 717 

Balaam Leslie N See Federer WT 

Bar-Hillel Yehoshura See Fraenkel AA 

Barker George Phillip See Schneider H 

Barlow RE Statistical Inference under 
Order Restrictions 968 


Barnes JA See Murdoch J 
Barnett IA Elements of Number Theory 
Revised Edition 711 
Barrett James P Elementary Computer 
Programs for Statistical Analysis 
582 
Battersby Albert Network Analysis for 
Planning and Scheduling Third Edition
112 
Bauer F Garabedian P Korn D Lecture 
Notes in Economics and Mathematical 
Systems-66 719 
Bauer Heinz Probability Theory and 
Elements of Measure Theory 108 
Bavinck H Jacobi Series and Approximation
579 
Baxter Willard E Sloyer Clifford W 
Calculus with Probability For the 
Life and Management Sciences 961 
Bear HS Algebra and Elementary Functions
339 
Algebra for College Students 339 
Beard Robert M See Copi IM 
Beauchamp Murray A Elements of Mathematical
Sociology 230 
Beauregard Raymond A Fraleigh JB A 
First Course in Linear Algebra With 
Optional Introduction to Groups 
Rings and Fields 459 
Beck Anatole (Ed) Lecture Notes in Mathematics-318
966 
Beckenstein Edward See Narici L 
Beckman David N See Crouch Ralph B 
Beckmann Petr The Structure of Language
A New Approach 1088 
Orthogonal Polynomials for Engineers
and Physicists 1089 
Bedford FW Dwivedi TD Vector Calculus 
103 
Behzad Mehdi Chartrand G Introduction 
to the Theory of Graphs 100 
Belinfante Johan GF Kolman B A Survey 
of Lie Groups and Lie Algebras with 
Applications and Computational Methods
102 
Belkner Horst Metrische Räume 1162 
Bell JL Slomson AB Models and Ultraproducts
An Introduction 221 
Bellman Richard Perturbation Techniques
in Mathematics Physics and Engineering
963 
Methods of Nonlinear Analysis V II 
965 
Bendersky M Generalized Cohomology and 
K-Theory 227 
Benice Daniel D Arithmetic and Algebra 
221 
Benjamin B Haycocks HW The Analysis of 
Mortality and Other Actuarial 
Statistics 719 

Benney David J See Greenspan HP 

Berberian Sterling K Baer*-Rings 223 

Berenstein Carlos A Dostal MA Lecture 
Notes in Mathematics-256 715 

Berman Gerald Fryer KD Introduction to 
Combinatorics 100 

Berman Simon L See Dolciani MP 

Berston Hyman Maxwell Fisher P Collegiate
Business Mathematics 829 

Bézier P Numerical Control Mathematics 
and Applications 464 

Bhagavantam S Venkatarayudu T Theory of 
Groups and Its Application to Physical
Problems 112 

Bharucha-Reid AT Random Integral Equations
580 

Bhat U Narayan Elements of Applied 
Stochastic Processes 108 

Bhattacharya PB Jain SK First Course in 
Group Theory 960 

Bicknell Marjorie Hoggatt Jr WE (Ed) A 
Primer for the Fibonacci Numbers 338 

Birkhoff Garrett Hall Jr M Computers in 
Algebra and Number Theory 224 

Bishop Errett Cheng H Constructive 
Measure Theory 715 

Bitter Gary G See Dorn WS 

Blackith RE Reyment RA Multivariate 
Morphometrics 464 

Blatter Jorg Grothendieck Spaces in Approximate
Theory 341 

Blum EK Numerical Analysis and Computation
Theory and Practice 340 

Blum Julius R Rosenblatt JI Probability 
and Statistics 462 

Bohigian Haig Edward The Foundations and 
Mathematical Models of Operations Research
with Extensions to the Criminal
Justice System 719 

Bolzano Bernard Theory of Science 457 

Boone WW Cannonito FB Lyndon RC (Ed) 
Word Problems Decision Problems and 
the Burnside Problem in Group Theory 
834 

Borel Armand Lecture Notes in Mathematics-276
713 

Borsuk K Theory of Shape 227 

Bott R Gitler S James IM Lecture Notes 
in Mathematics-279 107 

Boullion Thomas L Odell PL Generalized 
Inverse Matrices 223 

Bourbaki Nicolas Elements of Mathematics
Commutative Algebra 713 

Bousfield AK Kan DM Lecture Notes in 
Mathematics-304 835 

Bowen Earl K Mathematics with Applications
in Management and Economics 
Third Edition 583 

Bower Julia Wells Mathematics A Creative
Art 1155 

Bowley AL FY Edgeworth's Contributions
to Mathematical Statistics 
1162 

Braverman Jerome D Probability Logic 
and Management Decisions 229 

Bredon Glen E Introduction to Compact 
Transformation Groups 343 

Breiman Leo Statistics with a View Toward
Applications 836 

Brent Richard P Algorithms for Minimization
Without Derivatives 341 

Bretagnolle JL Lecture Notes in Mathematics-307
1086 

Brewer James W Rutter EA (Ed) Lecture 
Notes in Mathematics-311 713 

Brézis H Operateurs Maximaux Monotones 
et Semi-Groupes de Contractions dans 
les Espaces de Hilbert 964 

Brinker Orason L See Plachy JM 

Brodskii MS Triangular and Jordan Representations
of Linear Operators 
578 

Brody Linda A See Silverman EN 

Bronshtein IN Semendyayev KA A Guide-Book
to Mathematics 336 

Brousseau Brother Alfred Fibonacci and 
Related Number Theoretic Tables 338 

Browder William Surgery on Simply-Connected
Manifolds 343 

Brown Sanborn C (Ed) Changing Careers 
in Science and Engineering 336 

Bruijn Nicolaas Govert de Automath A 
Language for Mathematics 838 

Brumfiel Charles F See Fleenor CR 

Bruns Carl M Algebra An Introduction 
for College Students 220 

Brunschvicg L Les Etapes de La Philosophie
Mathématique 339 

Brush Stephen G (Ed) Resources for the 
History of Physics 832 

Brush Stephen G King AL History in the 
Teaching of Physics 832 

Bruter CP (Ed) Lecture Notes in Mathematics-211
100 

Bucur I Lecture Notes in Mathematics-274
222 

Bucur Ion Deleanu A Introduction to the 
Theory of Categories and Functors 337 

Budden FJ The Fascination of Groups 102 

Bunge Mario (Ed) Problems in the Foundations
of Physics 113 

Burckel RB Characterizations of C(X) Among
Its Subalgebras 461 

Burdette AC An Introduction to Analytic
Geometry and Calculus Revised 
Edition 460 

Burington Richard Stevens Handbook of 
Mathematical Tables and Formulas 
Fifth Edition 219 

Burke CJ See Levine G 

Burstein Samuel Z See Lax Peter D 

Burton David M Abstract and Linear 
Algebra 223 

Bush Grace A Young JE Foundations of 
Mathematics Second Edition With Application
to the Social and Management
Sciences 1155 

Butts Thomas Problem Solving in Mathematics
Elementary Number Theory and 
Arithmetic 709 

Butzer PL Kahane J-P Szokelfalvi-Nagy B 
Linear Operators and Approximation 
964 

Byrne George D Hall CA (Ed) Numerical 
Solution of Systems of Nonlinear 
Algebraic Equations 1159 

Byron Jr Frederick W Fuller RW Mathematics
of Classical and Quantum 
Physics 584 

Cain Rolene B Elementary Statistical 
Concepts 228 

Callahan John Sternberg S Weiss E Modern 
Elementary Mathematics A Laboratory 
Approach 221 

Cannonito FB See Boone WW 

Carico Charles C See Drooyan I 

Carter Roger W Simple Groups of Lie 
Type 459 

Cassel Don Programming Language One 582 

Cassels JWS An Introduction to Diophantine
Approximation 458 

Castonguay Charles Meaning and Existence 
in Mathematics 576 

Challifour John L Generalized Functions 
and Fourier Analysis An Introduction 
1160 

Chambadal L Formulaire de Mathématiques 
709 

Chavel Isaac Riemannian Symmetric 
Spaces of Rank One 716 

Chen Wai-Kai Applied Graph Theory 110 

Cheng Henry See Bishop E 

Cherkasova MP Collected Problems in 
Numerical Methods 834 

Chirlian Paul M Introduction to FORTRAN 
IV with Timeshare and Batch Operation
838 

Chou Chin Cheng Lecture Notes in Mathematics-326
1159 

Christian Robert R Introduction to Logic 
and Sets Second Edition 1080 

Ciampi A Classical Hamiltonian Linear 
Systems 577 


Cilley David M See Kidd KP 

Cissell Helen See Cissell R 

Cissell Robert Cissell H Mathematics of 
Finance Fourth Edition 336 

Clark Colin The Theoretical Side of 
Calculus 460 

Clarke Douglas A Foundations of Analysis
with an Introduction to Logic 
and Set Theory 221 

Coale Ansley J The Growth and Structure
of Human Populations A Mathematical
Investigation 582 

Cochran James Alan The Analysis of Linear
Integral Equations 578 

Cockcroft WH Complex Numbers A Study 
in Algebraic Structure 459 

Cohen MM A Course in Simple-Homotopy 
Theory 1085 

Coles William J Reed KD Tucker DH Calculus
A Preliminary Edition 962 

Colombo S Lavoine J Transformations de 
Laplace et de Mellin 107 

Colwell Peter Mathews JC Introduction 
to Complex Variables 1159 

Combés Michel Fondements des Mathéma- 
tiques 99 

Comrie LJ Chambers Shorter Six-Figure 
Mathematical Tables 456 

Conover SJ Practical Nonparametric Statistics
581 

Constantinescu Corneliu Cornea A Potential
Theory on Harmonic Spaces 1161 

Conte SD deBoor C Elementary Numerical 
Analysis An Algorithmic Approach 
Second Edition 226 

Cooley William W Lohnes PR Multivariate
Data Analysis 581 

Copi Irving M Beard RM (Ed) Essays on 
Wittgenstein's Tractatus 832 

Corlett PN Tinsley JD Practical Programming
Second Edition 344 

Cornea Aurel See Constantinescu C 

Cortez Marion J See Ohmer MM 

Cournot Augustin Researches Into the 
Mathematical Principles of the 
Theory of Wealth 465 

Cox DR The Analysis of Binary Data 718 

Coxeter HSM Moser WOJ Generators and 
Relations for Discrete Groups Third 
Edition 337 

Craggs JW Models and Measurement 219 

Crossley JN What is Mathematical Logic? 
338 

Crouch Ralph B Beckman DN The Struc- 
ture of Abstract Algebra 713 

Crowdis David G Wheeler BW Intermediate 
Algebra for Colleges 220 


Crowell Richard H See Williamson RE 


CUPM Proceedings Summer Conference for 
College Teachers on Applied Mathematics
574 

Curry Haskel B Hindley JR Seldin JP Combinatory
Logic V II 458 

Curtain Ruth F (Ed) Lecture Notes in 
Mathematics-294 462 

Curtis Alan R Practical Math for Business
829 

Curtis Jr Philip C Multivariate Calculus
with Linear Algebra 714 

Calculus With An Introduction to 
Vectors 103 

Cutler Ann McShane R The Trachtenberg 
Speed System of Basic Mathematics 957 

Daclin E See Perrin J-P 

Damerau Frederick J Markov Models and 
Linguistic Theory An Experimental 
Study of a Model for English 583 

Dao-xing Xia Measure and Integration 
Theory on Infinite-Dimensional 
Spaces Abstract Harmonic Analysis 964 

Darboux Gaston Lecons sur la Théorie 
Générale Des Surfaces V I-IV 338 

Davidson Melvin PL/1 Programming with 
PL/C 837 

Davidson Ronald C Marion JB Mathematical
Preparation for General Physics 
with Calculus 959 

Davies RG Computer Programming in Quantitative
Biology 582 

Davis E Allan Pedersen JJ Essentials of 
Trigonometry Second Edition 830 

Davis Lee W Fundamental Mathematics for 
Technical Students 958 

Davis Thomas A Algebra and Trigonometry 
97 

Algebra and Trigonometry in Four 
Programmed Volumes 339 

Day A Colin Fortran Techniques with 
Special Reference to Non-Numerical 
Applications 838 

Day Richard H Robinson SM (Ed) Mathematical
Topics in Economic Theory and 
Computation 719 

deBoor Carl See Conte SD 

deFinetti Bruno Probability Induction 
and Statistics The Art of Guessing 
716 

Degrazia Joseph Math is Fun 710 

de la Harpe Pierre Lecture Notes in Mathematics-285
341 

Deleanu Aristide See Bucur I 

Dellacherie Claude Capacités et Processus
Stochastiques 1086 

DeLuca Louis J Sedlock JT Calculus A 
First Course 1083 

Demazure Michel Lecture Notes in 
Mathematics-302 712 
Denouette M See Perrin J-P 
Deskins WE Abstract Algebra Fourth 
Printing 1081 
Deuring Max Lecture Notes in Mathematics-314
1158 
DeVore Ronald A Lecture Notes in Mathematics-293
962 
Dhrymes Phoebus J Distributed Lags 
Problems of Estimation and Formulation
111 
Dick Elie M Current Information Sources 
in Mathematics An Annotated Guide to 
Books and Periodicals 1960-1972 574 
Dickson LE Linear Algebras 1081 
Dieudonné J Elements D'Analyse Tome III 
1160 
Elements D'Analyse Tome IV 1161 
Treatise on Analysis V II 835 
Treatise on Analysis V III 1161 
Dixon Charles Applied Mathematics of 
Science and Engineering 579 
Dock V Thomas FORTRAN IV Programming 968 
Dodes Irving Allen Finite Mathematics A 
Liberal Arts Approach 96 
Doetsch Gustav Guide to the Applications
of the Laplace and Z-Transforms
111 
Dolciani Mary P Berman SL Wooton W Modern
Algebra and Trigonometry Structure
and Method Book Two Revised 
Teacher's Edition 456 
Dolciani Mary P See Sorgenfrey RH 
Dold A Lectures on Algebraic Topology 
342 
Dold A Eckmann B (Ed) Lecture Notes in 
Mathematics-275 225 
Lecture Notes in Mathematics-288 
223 
Lecture Notes in Mathematics-317 
1156 
Dorn William S Bitter GG Hector DL Computer
Applications for Calculus 103 
Dorn William S McCracken DD Numerical 
Methods with Fortran IV Case Studies 
226 
Dorsett Joseph L College Algebra 98 
Dorwart Harold L The Geometry of Incidence
342 
Dostal Milos A See Berenstein CA 
Dou Alberto Lectures on Partial Differential
Equations of First Order 104 
Douglas Ronald G Banach Algebra Techniques
in Operator Theory 460 
Dreyfuss Martin J Speaking of Math 
Principles of Elementary Mathematics
and A Study Guide 1078 
Drooyan Irving Hadel W A Programmed Introduction
to Number Systems Second 
Edition 457 
Trigonometry An Analytic Approach 
Second Edition 457 
Elementary Algebra Structure and 
Skills Third Edition 828 
Dubbey JM Development of Modern Mathematics
831 
Dubisch Roy See Howes VE 
DuChateau Paul The Cauchy-Goursat Problem
104 
Duff Charles L See Hackert AF 
Dunford Nelson Schwartz JT Linear Operators
Part III Spectral Operators 578 
Durbin John R Mathematics Its Spirit and 
Evolution 1155 
Duren Jr William L Calculus 459 
Calculus and Analytic Geometry 460 
Dwinger Ph Introduction to Boolean Algebras
Second Revised and Enlarged Edition
102 
Dwivedi TD See Bedford FW 
Dyer Eldon Cohomology Theories 716 
Dym H McKean HP Fourier Series and Integrals
1160 
Dynkin Evgenii B Yushkevich AA Markov 
Processes Theorems and Problems 967 
Eadie WI Statistical Methods in Experimental
Physics 968 
Eames WR Stanton RG Thomas RSD (Ed) Proceedings
of the Twenty-Fifth Summer 
Meeting of the Canadian Mathematical 
Congress June 16-18 1971 709 
Earle James H Descriptive Geometry 1088 
Easton Richard J Graham Jr George P Intermediate
Algebra 339 
Eckhaus Wiktor Matched Asymptotic Expansions
and Singular Perturbations 
964 
Eckmann B See Dold A 
Edelen Dominic GB Kydoniefs AD An Introduction
to Linear Algebra for 
Science and Engineering 102 
Edwards RE Integration and Harmonic Analysis
on Compact Groups 106 
Ekambaram SK The Statistical Basis of 
Quality Control Charts A Manual for 
Business and Factory Managers Second 
Revised Edition 718 
Eley Lothar Edmund Husserl Philosophie 
Der Arithmetik 457 
Elson Mark Concepts of Programming 
Languages 968 
Elzey Freeman F A Programmed Introduction
to Statistics Second Edition 967 
A First Reader in Statistics 229 


Emch Gerard G Algebraic Methods in Statistical
Mechanics and Quantum Field 
Theory 113 

Emerson Lloyd S Paquette LR Linear Algebra
Calculus and Probability Fundamental
Mathematics for the Social 
and Management Sciences 710 

Emmet ER Brain Puzzler's Delight, Fourth 
Edition 710 

Enderton Herbert B A Mathematical Introduction
to Logic 99 

Endler Otto Valuation Theory 458 

England AH Complex Variable Methods in 
Elasticity 465 

Erricker BC Advanced General Statistics
718 

Essick Edward L RPG for System/360 and 
System/370 968 

Eulenberg Milton D Intermediate Alge- 
bra A College Approach 221 

Even Shimon Algorithmic Combinatorics 
833 

Everitt WN Sleeman BD (Ed) Lecture Notes 
in Mathematics-280 226 

Everling W Lecture Notes in Economics 
and Mathematical Systems-65 110 

Eves Howard W The Other Side of the 
Equation 96 

Ewald Gunter Geometry An Introduction 
227 

Eymard Pierre Lecture Notes in Mathematics-300
1161 

Fandel G Lecture Notes in Economics and 
Mathematical Systems-76 964 

Fang J A Guide to the Literature of Mathematics
Today 574 

Fano Guido Mathematical Methods of 
Quantum Mechanics 230 

Farina Mario V See Gleim GA 

Farnsworth D (Ed) Methods of Local and 
Global Differential Geometry in 
General Relativity 720 

Federer Walter T Balaam LN Bibliography 
on Experiment and Treatment Design 
Pre-1968 967 

Feeman George F Grabois NR Linear Algebra
and Multivariable Calculus 103 

Ferguson Allan (Ed) Natural Philosophy 
Through the 18th Century and Allied 
Topics 831 

Ferrier JP Lecture Notes in Mathematics
164 1085 

Fettis Henry E An Improved Tabulation 
of the Plasma Dispersion Function 
and Its First Derivative 1089 

Fichtenholz GM Infinite Series Rudi- 
ments 461 

Fierz Markus Lecture Notes in Physics-15
99 

Fine Terrence L Theories of Probability 
An Examination of Foundations 581 

Finlayson Bruce A The Method of Weighted
Residuals and Variational Principles
with Application in Fluid Mechanics
Heat and Mass Transfer 465 

Finney DJ An Introduction to Statistical 
Science in Agriculture Fourth Edition
836 

Fisher Paul See Berston HM 

Fisher Sir Ronald A The Design of Ex- 
periments 1162 

Fitzgerald William M Laboratory Manual 
for Elementary Mathematics Second 
Edition 1080 

Flanders Harley Korfhage RR Price JJ A 
First Course in Calculus with Analytic
Geometry 834 

Introductory College Mathematics 
with Linear Algebra and Finite Mathematics
829 

Elementary Functions and Analytic 
Geometry 1079 

Flaschel P Klingenberg W Lecture Notes 
in Mathematics-282 227 

Fleenor Charles R Shanks ME Brumfiel CF 
The Elementary Functions Second Edition
958 

Fobes Melcher P Elementary Functions 
Backdrop for the Calculus 339 

Forbes Eric G The Unpublished Writings 
of Tobias Mayer V I-II 831 

Forman William Gavurin LL Elements of 
Arithmetic Algebra and Geometry 220 

Fossum Robert M The Divisor Class Group 
of a Krull Domain 1159 

Fowler RH The Elementary Differential 
Geometry of Plane Curves 965 

Fowles Grant R Analytical Mechanics 
Second Edition 112 

Fraenkel Abraham A Bar-Hillel Y Levy A 
Foundations of Set Theory Second Revised
Edition 576 

Fraissé Roland Cours de Logique Mathématique
Tome I 222 

Fraleigh John B Calculus A Linear Approach
V II 103 

Fraleigh John B See Beauregard RA 

Frank Jr Charles R Statistics and Econometrics
717 

Frank Thomas S Smith JF Modern Calculus 
1082 

Freedman David Approximating Countable 
Markov Chains 108 

Freiberger Walter (Ed) Statistical Computer
Performance Evaluation 463 

Freund John E Introduction to Probability
836 

Friedman Avner Differential Games 1160 

Friend J Newton Numbers Fun and Facts 97 


Frisk Peter D See Gustafson RD 

Fryer KD See Berman G 

Fuchs László Infinite Abelian Groups 
V II 576 

Fuglede Bent Lecture Notes in Mathematics
289 715 

Fukunaga Keinosuke Introduction to Statistical
Pattern Recognition 968 

Fuller Gordon Analytic Geometry Fourth 
Edition 829 

Fuller Robert W See Byron Jr FW 

Gagen Terrence Hale Jr MP Shult EE (Ed) 
Finite Groups '72 960 

Gallagher RH Yamada Y Oden JT (Ed) Recent
Advances in Matrix Methods of 
Structural Analysis and Design 583 

Gambill Robert See Shanks ME 

Garabedian P See Bauer F 

Gardner Constance Moore See May KO 

Gardner KL Glenn JA Renton AIG (Ed) 
Children Using Mathematics 830 

Gardner Robert B Lectures on Exterior 
Algebras over Commutative Rings 224 

Garfinkel Robert S Nemhauser GL Inte- 
ger Programming 1084 

Garnett John Lecture Notes in Mathematics-297
1084 

Gavurin Lester L See Forman W 

Geach PT Logic Matters 338 

Gear CW Introduction to Computer Science
838 

Gemignani Michael Calculus A Short 
Course 714 

Elementary Topology Second Edition
966 

Geoffrion AM (Ed) Perspectives on Op- 
timization A Collection of Exposi- 
tory Articles 579 

Germain Clarence B PL/1 For the IBM 360 
230 

Gessner P Spremann K Lecture Notes in 
Economics and Mathematical Systems-64
106 

Chartrand Gary See Behzad M 

Giacaglia GEO Perturbation Methods in 
Non-Linear Systems 715 

Gihman II Skorohod AV Stochastic Differential
Equations 967 

Gillespie RP Solving Problems in Advanced
Calculus I 961 

Gilmer Robert Multiplicative Ideal 
Theory 223 

Gitler S See Bott R 

Glaeser George Mathématiques pour 
L'élève professeur 830 

Gleim George A Farina Mario V Data 
Processing Mathematics 344 


Glenn JA See Gardner KL 


Glenn William H See Johnson DA 

Glicksman Abraham M See Ruderman HD 

Gluss Brian An Elementary Introduction 
to Dynamic Programming A State Equation
Approach 110 

Gobran Alfonse Algebra A Course for 
College Students 829 

Godement Roger Jacquet H Lecture Notes 
in Mathematics-260 101 

Gohring Kenneth W Fifth Annual Simulation
Symposium Progress in Simulation
V 2 1089 

Goldberg Jack L Schwartz AJ Systems of 
Ordinary Differential Equations An 
Introduction 105 

Goldberg VV See Akivis MA 

Goldfeld Stephen M Quandt RE Nonlinear 
Methods in Econometrics 1087 

Goldstine Herman H The Computer From 
Pascal to von Neumann 718 

Gonzalez Richard F McMillan Jr C Machine 
Computation An Algorithmic Approach 
345 

Goodman AW The Mainstream of Algebra and 
Trigonometry 829 

Gordon Robert Robson JC Krull Dimension 
1158 

Gould Henry W Combinatorial Identities 
100 

Grabois Neil R See Feeman GF 

Graham Jr George P See Easton RJ 

Graves Robert L See Telser LG 

Grawoig Dennis See Hughes A 

Gray Andrew Lord Kelvin An Account of 
His Scientific Life and Work 831 

Gray HL Schucany WR The Generalized 
Jackknife Statistic 836 

Gray Mary W Calculus with Finite Mathematics
for Social Sciences 225 

Greeno James G See Restle F 

Greenspan Harvey P Benney DJ Calculus 
An Introduction to Applied Mathematics
961 

Gregory Robert Todd See Young DM 

Greub Werner Halperin S Vanstone R Connections
Curvature and Cohomology 
V I DeRham Cohomology of Manifolds 
and Vector Bundles 107 and 966 

Greville TNE (Ed) Population Dynamics 
582 

Griswold Ralph E The Macro Implementation
of SNOBOL 4 A Case Study of 
Machine-Independent Software Development
463 

Grize Jean-Blaise Logique Moderne 
Fascicule II 99 

Grossman Michael Katz R Non-Newtonian 
Calculus 580 

Grosswald Emil See Rademacher H 
Grothendieck A See Artin M 
Gulick Denny Lipsman RL (Ed) Lecture 
Notes in Mathematics-266 106 
Gupta Somesh Das See Anderson TW 
Gustafson R David Frisk PD Elementary 
Plane Geometry 461 
Hackert Adelbert F Duff CL Elements of 
Trigonometry 97 
Hadel Walter See Drooyan I 
Hadley G Kemp MC Finite Mathematics in 
Business and Economics 220 
Hagihara Yusuke Celestial Mechanics 
Perturbation Theory V II 720 
Hajek O (Ed) Lecture Notes in Mathematics-235
1161 
Hajek Petr See Vopěnka P 
Hakim Monique Topos annelés et schémas 
relatifs 961 
Halacy Dan Charles Babbage Father of 
the Computer 99 
Hale Jr Mark P See Gagen T 
Hall Charles A See Byrne GD 
Hall FM An Introduction to Abstract 
Algebra V 1 Second Edition 223 
Hall James E Algebra A Precalculus 
Course 220 
Analytic Geometry 958 
Trigonometry Circular Functions 
and Their Applications 829 
Hall Jr Marshall See Birkhoff G 
Hall Richard S About Mathematics 827 
Halperin Stephen See Greub W 
Hamburg Morris Statistical Analysis for 
Decision Making 228 
Hamilton Hugh J A Primer of Complex 
Variables with An Introduction to 
Advanced Techniques Second Printing
225 
Hansen Rodney T Calculus It's the 
Limit 103 
Happ HH (Ed) Gabriel Kron and Systems 
Theory 1087 
Harary Frank (Ed) New Directions in the 
Theory of Graphs 833 
Hardy F Lane Essentials of Precalculus 
Mathematics 959 
Hardy GH Collected Papers of GH Hardy 
V V 457 
The Integration of Functions of a 
Single Variable Second Edition 834 
Orders of Infinity 834 
Harkema R Simultaneous Equations A 
Bayesian Approach 837 
Harnett Donald L Introduction to Statistical
Methods and Solutions Manual
109 


Harris Jr William A Sibuya Y (Ed) Lecture
Notes in Mathematics-312 834 

Hart William L Basic College Algebra 98 

Hartkopf Roy Math Without Tears Second 
Printing 828 

Hartnett William E Principles of Modern 
Mathematics Book 2 340 

Harvey ER See Ashley JP 

Hashisaki Joseph See Peterson JA 

Haskell Richard E Introduction to Vectors
and Cartesian Tensors A Programmed
Text for Students of Science 
and Engineering 458 

Haupt Floyd E See Peterson JM 

Hawkes Nigel The Computer Revolution 345 

Haycocks HW See Benjamin B 

Healey James Jones M Mathematics for 
Profit A Business Mathematics Text 
219 

Heaps HS An Introduction to Computer 
Languages 345 

Hector David L See Dorn WS 

Heimer Ralph T Basic Computer Concepts 
A Self-Instructional Approach 230 

Heineman E Richard College Algebra 574 

Heinmets F (Ed) Concepts and Models of 
Biomathematics Simulation Techniques 
and Methods 111 

Henderson Kenneth B Usiskin Z Zaring WM 
Precalculus Mathematics 220 

Herbrand Jacques Jacques Herbrand Logical
Writings 832 

Hermann Armin The Genesis of Quantum 
Theory (1899-1913) 1089 

Hermann Robert Vector Bundles in Mathematical
Physics V II 113 

Hermes Hans Introduction to Mathematical 
Logic 832 

Hershey Daniel Transport Analysis 1087 

Herskowitz Gerald J Schilling RB (Ed) 
Semiconductor Device Modeling for 
Computer-Aided Design 719 

Hesse Otto Ludwig Otto Hesse's Gesammelte 
Werke 1157 

Higgins Jon L Mathematics Teaching and 
Learning 830 

Higgins Philip J Notes on Categories and 
Groupoids 337 

Hille Einar Methods in Classical and 
Functional Analysis 578 

Hilton Peter J Category Theory 1081 

Hinderer K Grundbegriffe der Wahrscheinlichkeitstheorie
461 

Hindley JR Lercher B Seldin JP Introduction
to Combinatory Logic 457 

Hindley J Roger See Curry HB 

Hines William W Montgomery DC Probability 
and Statistics in Engineering and 
Management Science 836 

Hironaka H Mumford D (Ed) Oscar Zariski 
Collected Papers V I 224 

Hirsch Seymour C Essentials of Fortran 
IV 837 

Hirst KE Calculus of One Variable 1083 

Hocquemiller J See Weil J 

Hoel Paul G Jessen RJ Basic Statistics 
for Business and Economics 344 

Hoggatt Jr Werner E See Bicknell M 

Hohn Franz E Elementary Matrix Algebra 
Third Edition 960 

Holden Alan Shapes Space and Symmetry 
342 

Hollister Herbert A Modern Algebra A 
First Course 102 

Holz Jean L See Peterson WW 

Houzel C (Ed) Lecture Notes in Mathematics-277
226 

Howes Vernon E Dubisch R Self-Teaching 
Intermediate Algebra Second Edition 
828 

Howson AG A Handbook of Terms Used in 
Algebra and Analysis 96 

Hu TC Robinson SM (Ed) Mathematical 
Programming 964 

Hu TC Integer Programming and Network 
Flows 341 

Huey RM See Karbowiak AE 

Hughes Ann Grawoig D Statistics A Foundation
for Analysis 967 

Hughes DR Piper FC Projective Planes 965 

Humphreys JE Introduction to Lie Algebras
and Representation Theory 576 

Iglewicz Boris Stoyle J An Introduction 
to Mathematical Reasoning 339 

Illusie Luc Lecture Notes in Mathematics-283
224 

Ingham AE The Distribution of Prime 
Numbers 833 

Ishihara Shigeru See Yano K 

Iversen Birger Lecture Notes in Mathematics-310
1158 

Jacquet Hervé Lecture Notes in Mathematics-278
101 

Jacquet Hervé See Godement R 

Jain SK See Bhattacharya PB 

James IM See Bott R 

Janusz Gerald J Algebraic Number Fields 
1157 

Jardine Nicholas Sibson R Mathematical 
Taxonomy 464 

Jenner WE Lectures on Non-Associative 
Algebras 712 

Jensen Finn V See Arndt O 

Jessen Raymond J See Hoel PG 

Jewett John Phelps CR Undergraduate 
Education in the Mathematical 
Sciences 1970-71 574 

Johnson David E Johnson JR Graph Theory 
with Engineering Applications 111 

Johnson Donovan A Glenn WH Exploring 
Mathematics on Your Own 709 

Johnson Johnny R See Johnson DE 

Johnson Norman L Kotz S Distributions in 
Statistics Continuous Multivariate 
Distributions 717 

Johnson Phillip E A History of Set 
Theory 831 

Jones Mark See Healey J 

Jordan Karoly Chapters on the Classical 
Calculus of Probability 108 

Kahane J-P See Butzer PL 

Kan DM See Bousfield AK 

Kaplansky Irving Fields and Rings Second
Edition 224

Kapur JN The Fascinating World of Mathe-
matics 336

Thoughts on the Nature of Mathema-
tics 828
Thoughts on Mathematical Education
830

Karbowiak AE Huey RM Information Compu- 
ters Machines and Man 582 

Karras U Cutting and Pasting of Manifolds
SK-Groups 1162

Katz Robert See Grossman M 

Kaufman Kenneth See Marano J 

Kawata Tatsuo Fourier Analysis in Prob-
ability Theory 716

Keedy Marvin L Nelson CW Geometry A Mo-
dern Introduction Second Edition 965

Kegel Otto H Wehrfritz BAF Locally Fin-
ite Groups 961

Keilis-Borok VI (Ed) Computational Seis-
mology 1088

Kelly GM Laplaza M Lewis G MacLane S
Lecture Notes in Mathematics-281 224

Kemeny John G Snell JL Mathematical
Models in the Social Sciences 230

Kemp MC See Hadley G 

Kendall MG See Pearson ES 

Kenyon Hewitt Morse AP Web Derivatives 
1159 

Keros John W Computers Fortran IV and
Data Processing Applications 229

Kidd Kenneth P Myers SS Cilley DM The
Laboratory Approach to Mathematics
576

Kiendl H Lecture Notes in Economics and
Mathematical Systems-73 964

King Allen L See Brush SG 

Kingman JFC Regenerative Phenomena 580 

Kirk Roger E (Ed) Statistical Issues A
Reader for the Behavioral Sciences
109




Klauder John R (Ed) Magic Without Magic
John Archibald Wheeler 720
Klein Erwin Mathematical Methods in
Theoretical Economics Topological
and Vector Space Foundations of
Equilibrium Analysis 579
Kleisli Heinrich Resolutions in Addi-
tive and Non-Additive Categories 1158
Kline Morris Why Johnny Can't Add The
Failure of the New Math 575
Mathematical Thought from Ancient
to Modern Times 831
Klingenberg W See Flaschel P 
Klinger William R Wright RR Basic Alge-
bra 457
Knops RJ (Ed) Lecture Notes in Mathema-
tics-316 963
Knutson Donald Lecture Notes in Mathema-
tics-308 1081
Kobayashi Shoshichi Nomizu K Founda-
tions of Differential Geometry V II
227
Transformation Groups in Differ-
ential Geometry 1161
Kochendorffer R Determinanten und
Matrizen 459
Kock A Wraith GC Elementary Toposes 227 
Kogbetliantz Ervand Krikorian A Hand- 
book of First Complex Prime Numbers 
223 
Kohn Joseph J Differential Complexes 226
Kolchin ER Differential Algebra and
Algebraic Groups 960
Kolman Bernard See Belinfante JGF
Komkov Vadim Lecture Notes in Mathema-
tics-253 110
Koosis Donald J Business Statistics 716
Probability 1086
Korfhage Robert R See Flanders H 
Korn D See Bauer F 
Kornai Janos Anti-Equilibrium On Eco-
nomic Systems Theory and the Tasks
of Research 465
Kotler Philip Marketing Decision Making
A Model Building Approach 583
Kotz Samuel See Johnson NL
Kraus David H Zunde P Slamecka V Na-
tional Science Information Systems
A Guide to Science Information Sy-
stems in Bulgaria Czechoslovakia
Hungary Poland Romania and Yugo-
slavia 97
Krikorian Alice See Kogbetliantz E 
Krulik Stephen A Mathematics Labora-
tory Handbook for Secondary Schools
1156
A Handbook of Aids for Teaching
Junior-Senior High School Math 1156




Kshirsagar Anant M Multivariate Analy-
sis 717

Ku HT (Ed) Lecture Notes in Mathematics-
298, 299 580

Kubota Tomio Elementary Theory of
Eisenstein Series 1157

Kumpera Antonio Spencer Donald Lie
Equations V I General Theory 107

Kuratowski Kazimierz Introduction to
Set Theory and Topology Second Edi-
tion 966

Kurepa DR (Ed) Topology and its Appli-
cations 343

Kuznetsov Boris Einstein and Dostoyevsky
957

Kydoniefs Anastasios D See Edelen DGB 

Lam TY The Algebraic Theory of Quadratic
Forms 712

Landkof NS Foundations of Modern Poten-
tial Theory 715

Lane Bennie R Programmed Guide to Ac-
company Finite Mathematics 96

Lang Serge Introduction to Algebraic
and Abelian Functions 101

Introduction to Algebraic Geometry
101

Langbehn George J Lathrop TG Martini CJ
Fundamental Concepts of Mathematics
220

Lanning George E See Russell DS 

Laplaza M See Kelly GM 

Larsen Ronald An Introduction to the
Theory of Multipliers 578

Larson Harold J Introduction to the 
Theory of Statistics 229 

Lathrop Thomas G See Langbehn GJ 

Lavoine J See Colombo S

Lawrence J Dennis A Catalog of Special
Plane Curves 219

Lawson Jr Harold W See Neuhold EJ 

Lax Anneli See Lax PD 

Lax Peter D Burstein SZ Lax A Calculus
with Applications and Computing V I
224

Leathem JG Volume and Surface Integrals
Used in Physics 962

Lebedev NN Special Functions and Their
Applications 965

Leblanc Hugues (Ed) Truth Syntax and 
Modality 833 

Lebowitz Aaron See Rauch HE 

LeCam Lucien M Neyman J Scott EL (Ed)
Proceedings of the Sixth Berkeley
Symposium on Mathematical Statistics
and Probability V I-II V-VI 344

Proceedings of the Sixth Berkeley
Symposium on Mathematical Statistics
and Probability V III-IV 463




Leclerc Bruno Cahiers Mathématiques IV
Distributions statistiques et lois
de probabilité 967

Ledbetter David A Intermediate Algebra 
1156 

Ledin Jr George See Louden RK 

Leigh Jr Egbert Giles Adaptation and
Diversity Natural History and the
Mathematics of Evolution 111

Leithold Louis The Calculus with Analy-
tic Geometry Second Edition Part II
104

Lentin André Equations Dans Les Monoïdes
Libres 1082

Lentner Marvin Elementary Applied Sta-
tistics 228

Leonard JM Statistics The Arithmetic of
Decision-Making 1086

Lercher B See Hindley JR 

Leslie John F Whitworth LL Core Mathe-
matics 958

L'Esperance Wilford L Modern Statistics
for Business and Economics 228

Levine Gustav Burke CJ Mathematical
Model Techniques for Learning
Theories 1089

Levy Azriel See Fraenkel AA 

Lewis G See Kelly GM 

Lewis Peter AW (Ed) Stochastic Point
Processes Statistical Analysis
Theory and Applications 109

Lial Margaret L Miller CD College Alge- 
bra 958 

Beginning Algebra 828 

Lick DR See Alavi Y 

Lieberstein H Melvin Theory of Partial
Differential Equations 340

Lindgren Kenneth E See Wright DF 

Linsky Leonard (Ed) Reference and 
Modality 338 

Lipsman Ronald L See Gulick D 

Locke Flora M Math Shortcuts 220

Loebl Ernest M (Ed) Group Theory and
Its Applications V II 112

Loeckx J Lecture Notes in Economics and
Mathematical Systems-68 718

Lohman Robert H Intuitive Calculus
with College Algebra 1082

Lohnes Paul R See Cooley WW 

Long Calvin T Elementary Introduction
to Number Theory Second Edition 101

Lootsma FA (Ed) Numerical Methods for
Non-Linear Optimization 1160

Lorch ER Precalculus Fundamentals of
Mathematical Analysis 575

Lorenzen Paul Differential and Inte-
gral A Constructive Introduction
to Classical Analysis 221


Louden Robert K Ledin Jr G Programming
the IBM 1130 Second Edition 463

Lowe PG Classical Theory of Structures
Based on the Differential Equation 111

Lucas JR The Concept of Probability 108

Luckhardt Horst Lecture Notes in Mathe-
matics-306 833

Lusin Nicolas Les Ensembles Analytiques
et leurs Applications 104

Luxemburg WAJ Robinson A (Ed) Contri-
butions to Non-Standard Analysis 99

Lyapunov AA (Ed) Systems Theory Re-
search (Problemy Kibernetiki) V 21 110

Lynch Ransom V Calculus with Computer
Applications 1082

Lyndon RC See Boone WW 

Mackie RK Mathematical Methods for
Chemists 1087

MacLane S See Kelly GM 

Mahler Kurt Introduction to p-adic
Numbers and Their Functions 711

Mahoney Michael Sean The Mathematical 
Career of Pierre de Fermat (1601- 
1665) 830 

Mal'cev AI Algorithms and Recursive
Functions 110

The Metamathematics of Algebraic
Systems Collected Papers 1936-1967 99

Manifold George C Calculating with
Fortran 345

Mann Richard A A FORTRAN IV Primer 463 

Mann W Robert See Taylor Angus E 

Manougian Manoug N Northcutt RA Ordi-
nary Differential Equations An
Introduction 963

Manougian MN See Ratti JS 

Mansfield Ralph Trigonometry with Ap-
plications 98

Marano Joseph Kaufman K Fundamentals
of Mathematics 828

Marascuilo Leonard A Statistical Me-
thods for Behavioral Science Re-
search 581

Marcus Marvin Minc H College Algebra 221 

Integrated Analytic Geometry and
Algebra with Circular Functions 959

Marder L Calculus of Several Variables
1083

Vector Fields 1083

Marion Jerry B Davidson RC Mathematical
Preparation for General Physics 830

Marion Jerry B See Davidson RC

Marshall Clifford W Applied Graph 
Theory 100 

Martin BR Statistics for Physicists 716

Martin Hedley G Mathematics for Engi-
neering Technology and Computing
Science 580




Martini Carl J See Langbehn GJ 

Maruyama G Prokhorov YuV (Ed) Lecture
Notes in Mathematics-330 1162

Mather Kenneth Statistical Analysis in
Biology 834

Mathews Jerold C See Colwell P 

Matlis Eben Torsion-Free Modules 459

Lecture Notes in Mathematics-327
1158

Maurin Krzysztof Calculus of Varia-
tions and Classical Field Theory
Part I 106

Maxfield John E Maxfield MW Keys to
Mathematics 1078

Maxfield Margaret W See Maxfield JE 

Maxwell Lee M Reed MB The Theory of
Graphs A Basis for Network Theory 101

May Francis B Introduction to Games of
Strategy 226

May JP Lecture Notes in Mathematics-271
107

May Kenneth O Bibliography and Research
Manual of the History of Mathema-
tics 832

The Mathematical Association of
America Its First Fifty Years 1157

May Kenneth O Gardner CM (Ed) World
Directory of Historians of Mathema-
tics First Edition 339

May W Graham Linear Algebra 712 

Mayeda Wataru Graph Theory 711 

McCracken Daniel D See Dorn WS 

McCullough Thomas Phillips K Founda-
tions of Analysis in the Complex
Plane 1084

McElroy Elam E Applied Business Stati-
stics An Elementary Approach 462

McKean HP See Dym H 

McMillan Jr Claude See Gonzalez RF 

McNeary Samuel S Introduction to Com-
putational Methods for Students of
Calculus 962

McShane Rudolph See Cutler A 

Meetham AR (Ed) Encyclopaedia of Lin-
guistics Information and Control 112

Mellor DH The Matter of Chance 109

Meltzer Bernard Michie D (Ed) Machine 
Intelligence 7 582 

Mendelson Elliott Number Systems and
the Foundations of Analysis 711

Menges Gunter Inference and Decision
1162

Merriman Gaylord M Sterrett A Matrices
and Linear Systems A Programmed In-
troduction 712

Meserve Bruce E Sobel MA Introduction
to Mathematics Third Edition 1155


Messing William Lecture Notes in 




Meyer Paul-André Lecture Notes in Ma-
thematics-284 343

Meyer Yves Algebraic Numbers and Harmo-
nic Analysis 101

Michie Donald See Meltzer B 

Miles John W Integral Transforms in Ap-
plied Mathematics 1161

Miller Charles D See Lial ML 

Miller Ronald E Modern Mathematical
Methods for Economics and Business
1084

Miller Jr Willard Symmetry Groups and
Their Applications 584

Minc Henryk See Marcus M 

Mizrahi Abe Sullivan M Finite Mathema-
tics with Applications for Business
and Social Sciences 957

Mohler RR Ruberti A (Ed) Theory and Ap-
plications of Variable Structure
Systems 582

Moineau J-C Mathématique de l'esthéti- 
que 96 

Moise Edwin E Calculus Second Edition 103

Elements of Calculus Second Edi-
tion 103

Molk Jules See Tannery J 

Montgomery Douglas C See Hines WW 

Montias Henri Descartes 98 

Moon Parry The Abacus Its history its
design its possibilities in the mo-
dern world 1155

Moon Robert G Applied Mathematics for
Technical Programs Arithmetic and
Geometry 958

Moore Carolyn C Why Don't We Do Some- 
thing Different? 830 

Moore Hal G Pre-Calculus Mathematics 959

Moran Jr M Marcus Aubuchon III WE Ap-
plied Business Mathematics 221

Morand Max Géométrie Spinorielle 960

Morgan Bryan Men and Discoveries in Ma-
thematics 831

Morse AP See Kenyon H 

Moser JK See Siegel CL 

Moser WOJ See Coxeter HSM 

Mosteller Frederick See Tanur JM 

Mulaik Stanley A The Foundations of
Factor Analysis 229

Mullin RC Reid KB Roselle DP (Ed) Pro-
ceedings of the Louisiana Conference
on Combinatorics Graph Theory and
Computing 100

Mullins Jr ER Rosen D Calculus Concepts 
1082 

Mumford D See Hironaka H 

Munem Mustafa A Tschirhart W Intermedi- 
ate Algebra 220 

Murdoch J Barnes JA Statistics Problems
and Solutions 1162
Muroga Saburo Threshold Logic and Its
Applications 720
Murray W (Ed) Numerical Methods for Un-
constrained Optimization 1085
Murrill Paul W Smith CL Introduction
to Computer Science 837
Basic Programming 230
Myers Shirley S See Kidd KP 
Naiman Arnold Rosenfeld R Zirkel G Un-
derstanding Statistics 228
Narasimhan Raghaven Several Complex
Variables 577
Narici Lawrence Beckenstein E Bachman G
Functional Analysis and Valuation
Theory 226
Nayfeh Ali Hasan Perturbation Methods 
1088 
NCTM The Teaching of Secondary School
Mathematics-33rd Yearbook 710
Instructional Aids in Mathematics
34th Yearbook 959
The Slow Learner in Mathematics
35th Yearbook 457
Nelson Charles W See Keedy ML 
Nemhauser George L See Garfinkel RS 
Neuhold Erich J Lawson Jr HW The PL/I
Machine: An Introduction to Pro-
gramming 837
Neumann BH Lectures on Topics in the
Theory of Infinite Groups 960
Newton Sir Isaac Mathematical Prin-
ciples of Natural Philosophy and
His System of the World V I 98
Philosophiae Naturalis Principia
Mathematica V I-II 99
Ney PE See Athreya KB 
Neyman Jerzy See LeCam LM 
Nijkamp P Planning of Industrial Com-
plexes by Means of Geometric Pro-
gramming 341
Nilsson Nils J Problem-Solving Methods
in Artificial Intelligence 110
Nomizu Katsumi See Kobayashi S
Norkin SB Differential Equations of the
Second Order with Retarded Argument
105
Northcutt Robert A See Manougian MN 
Novak J (Ed) General Topology and Its
Relations to Modern Analysis and
Algebra IV Proceedings of the Third
Prague Topological Symposium 1971 716
Noverraz Philippe Pseudo-Convexité Con-
vexité Polynomiale et Domaines
d'Holomorphie en Dimension Infi-
nie 964
Oberhettinger Fritz Tables of Bessel
Transforms 580


Odell Patrick L See Boullion TL 

Oden J Tinsley See Gallagher RH 

Ohmer Merlin M Aucoin CV Cortez MJ Ele-
mentary Contemporary Mathematics
Second Edition 575

OISE K-13 Mathematics Some Non-Geome-
tric Aspects Part II Computing
Logic and Problem-Solving 1157

Olive Gloria Mathematics for Liberal
Arts Students 827

O'Neil Peter V Fundamental Concepts of 
Topology 343 

Onicescu Octav Principes de Logique et
de Philosophie Mathématique 458

Orey Steven Lecture Notes on Limit
Theorems for Markov Chain Transition
Probabilities 343

Orlik Peter Lecture Notes in Mathema-
tics-291 461

Ostrowski A Aufgabensammlung Zur Infi-
nitesimalrechnung Band IIA Band IIB
225

Paley Hiram Weichsel PM Elements of Ab- 
stract and Linear Algebra 1081 

Panchev S Random Functions and Turbu-
lence 230

Papy Nombres et Vectoriel Plan Reels 341

Paquette Laurence R See Emerson LS 

Passman Donald S Infinite Group Rings 102 

Patankar SV See Srinath LS 

Pearl Martin Matrix Theory and Finite
Mathematics 712

Pearson ES Kendall MG Studies in the
History of Statistics and Probabil-
ity 831

Peck Lyman C Basic Mathematics for
Management and Economics 1078

Pedersen Jean J See Davis EA 

Pegels C Carl BASIC A Computer Program-
ming Language with Business and Mana-
gement Applications 838

Pennisi Louis L Elements of Ordinary
Differential Equations 104

Penrose Roger Techniques of Differential
Topology in Relativity 716

Pepples Jr WD See Wheeler RE 

Percus JK Combinatorial Methods 100

Perrin J-P Denouette M Daclin E Switch-
ing Machines, V 1 719

Switching Machines, V 2 720

Peterson John A Hashisaki J Theory of
Arithmetic Third Edition 710

Peterson John M Haupt FE Intermediate
Algebra and Workbook to Accompany
Intermediate Algebra 828

Peterson W Wesley Holz JL FORTRAN IV 
and the IBM 360 581 


Petrich Mario Introduction to Semigroups




Pfeiffer Paul E Schum DA Introduction
to Applied Probability 835

Phelps C Russell See Jewett J 

Phelps Jack Elementary Mathematics
Theory and Practice 219

Phillips EG Functions of a Complex
Variable with Applications 963

Phillips Keith See McCullough T 

Pietsch Albrecht Nuclear Locally Con-
vex Spaces Second Edition 579

Pinter Charles C Set Theory 1157 

Piper FC See Hughes DR 

Pipkin AC Lectures on Viscoelasticity
Theory 113

Pittnauer Franz Lecture Notes in Ma-
thematics-301 963

Pitts CGC Introduction to Metric Spaces 
966 

Plachy Jon M Brinker OL Elements of
Algebra A Worktext Second Edition
828

Plotkin BI Groups of Automorphisms of
Algebraic Systems 103

Pokropp F Lecture Notes in Economics
and Mathematical Systems-74 1087

Polya G Szego G Problems and Theorems
in Analysis V I 107

Powell Alan A Williams RA (Ed) Econo-
metric Studies of Macro and Mone-
tary Relations 719

Powers David L Boundary Value Problems 
104 


Press S James Applied Multivariate Analy-
sis 109

Preuss Gerhard Allgemeine Topologie 342

Price Justin J See Flanders H 

Proceedings Combinatorial Mathematics
and Its Applications 100

Prokhorov YuV See Maruyama G 

Prouse Howard L Turner VD Principles
of Mathematics 97

Prouse Howard L See Turner VD

Putnam Hilary Philosophy of Logic 833

Quandt Richard E See Goldfeld SM 

Quenouille MH Rapid Statistical Calcu-
lations Second Edition 1162

Quine Willard Van Orman Set Theory and
Its Logic Revised Edition 832

Raab Joseph A Audiovisual Materials in
Mathematics 1156

Rademacher Hans Topics in Analytic
Number Theory 1158

Rademacher Hans Grosswald E Dedekind 
Sums 222 

Rado Tibor On the Problem of Plateau
Subharmonic Functions 580

Raghunathan MS Discrete Subgroups of
Lie Groups 713


Rainville Earl D Intermediate Differen- 
tial Equations Second Edition 340 
Raj Des The Design of Sample Surveys 717
Rasmussen Søren Non-Linear Semi-Groups
Evolution Equations and Productinte-
gral Representations 106

Ratti JS Manougian MN Introductory Cal-
culus with Applications 1082

Rauch Harry E Lebowitz A Elliptic Func-
tions Theta Functions and Riemann
Surfaces 1083

Read Ronald C A Mathematical Background
for Economists and Social Scientists
456

Graph Theory and Computing 222

Redei L Lacunary Polynomials Over Fini-
te Fields 1157

Reed Keith D See Coles WJ 

Reed Myril B See Maxwell LM 

Reeves CM An Introduction to Logical
Design of Digital Circuits 345

Reid KB See Mullin RC 

Reid William H (Ed) Mathematical Pro-
blems in the Geophysical Sciences
V 1-2 112

Reid William T Riccati Differential
Equations 105

Renton AIG See Gardner KL 

Rényi Alfréd Letters on Probability 832 

Restle Frank Greeno JG Introduction to
Mathematical Psychology 584

Reyment RA See Blackith RE 

Ribenboim Paulo Rings and Modules 340 

Richardson Leonard F See Richardson M 

Richardson Moses Richardson LF Funda-
mentals of Mathematics Fourth Edi-
tion 575

Richman Fred Number Theory An Introduc-
tion to Algebra 102

Richman Fred Walker C Walker E College 
Trigonometry 575 

Mathematics for the Liberal Arts
Student Second Edition 1078

Ritt Robert K Fourier Series 226

Rivano Neantro Saavedra Lecture Notes
in Mathematics-265 224

Robert Alain Lecture Notes in Mathema-
tics-326 1158

Robertson AP Robertson W Topological
Vector Spaces Second Edition 835

Robertson Wendy See Robertson AP 

Robinson A See Luxemburg WAJ 

Robinson Derek JS Finiteness Conditions
and Generalized Soluble Groups 337

Robinson Stephen M See Day RH 

Robinson Stephen M See Hu TC 

Robison J Vincent Modern Algebra and
Trigonometry Second Edition 456




Robson JC See Gordon R 
Roessler Edward B See Alder HL 


Rolewicz Stefan Metric Linear Spaces 1161


Rose Donald J Willoughby RA (Ed) Sparse
Matrices and Their Applications 460

Roselle DP See Mullin RC 

Rosen David See Mullins Jr ER 

Rosenblatt Judah I See Blum JR 

Rosenfeld Robert See Naiman A 

Ross Sheldon M Introduction to Prob-
ability Models 343

Roussas George G Contiguity of Prob-
ability Measures Some Applications
in Statistics 108

Roxin Emilio O Ordinary Differential
Equations 963

Royal Quantities Units and Symbols 336

Ruberti A See Mohler RR 

Rubinoff Morris (Ed) Advances in Com- 
puters V 12 345 

Rubinowicz A Sommerfeldsche Polynom-
methode 460

Ruderman Harry D Glicksman AM Mathema-
tical Systems An Introduction 577

Rudin Walter Functional Analysis 577

Rummel RJ Applied Factor Analysis 584

Russell Donald S Lanning GE Intermedi-
ate Algebra Second Edition 98

Rutter Edgar A See Brewer JW 

Sagle Arthur A Walde RE Introduction
to Lie Groups and Lie Algebras 965

Sakai Shôichirô C*-Algebras and W*-
Algebras 578

Salmon Wesley C Statistical Explanation
and Statistical Relevance 344

Saltz Daniel A Short Calculus An Ap-
plied Approach 714

Sarma KR See Srinath LS 

Sawyer WW An Engineering Approach to 
Linear Algebra 337 

Schaaf William L The High School Mathe-
matics Library Fifth Edition 1156

Schechter Martin Principles of Func-
tional Analysis 106

Spectra of Partial Differential
Operators 577

Schey HM Div Grad Curl and All That An
Informal Text on Vector Calculus 1083

Schilling Ronald B See Herskowitz GJ 

Schminke CW Arnold WR (Ed) Mathematics
Is a Verb Options for Teaching a
Book of Readings 1156

Schmitt Klaus (Ed) Delay and Function-
al Differential Equations and Their
Applications 340

Schneider Hans Barker GP Matrices and 
Linear Algebra Second Edition 1081 


Schochet Claude Cobordism From an
Algebraic Point of View 227

Schonland David S La Symétrie Molécu-
laire 582

Schubert Horst Categories 713

Schucany WR See Gray HL 

Schum David A See Pfeiffer PE 

Schumaker John A See Weinberg GH 

Schwartz Arthur J See Goldberg JL 

Schwartz Jacob T Introduction to Ma-
trices and Vectors 712

Schwartz Jacob T See Dunford N 

Schwarz HA Gesammelte Mathematische
Abhandlungen Second Edition 1080

Schwarz HR Numerical Analysis of Sym-
metric Matrices 834

Scott Elizabeth L See LeCam LM 

Searle SR Linear Models 717 

Sedlock James T See DeLuca LJ 

Seeley Robert T Calculus of One Vari-
able Second Edition 714

Calculus of One and Several Vari-
ables 714

Seip Ulrich Lecture Notes in Mathema-
tics-273 341

Seldin JP See Hindley JR 

Seldin Jonathan P See Curry HB 

Selfridge Oliver G A Primer for FORTRAN 
IV On-Line 464 

Semendyayev KA See Bronshtein IN 

Sengupta Jati K Stochastic Programming 
Methods and Applications 1160 

Sengupta Jati K See Tintner G 

Serre J-P A Course in Arithmetic 959

Représentations Linéaires des
Groupes Finis Deuxième édition 224

Sgall Petr Tesitelova M Vachek J (Ed)
Prague Studies in Mathematical Lin-
guistics V 3 112

Shampine Lawrence F Allen Jr RC Numeri-
cal Computing An Introduction 1160

Shanks Merrill E Gambill R Calculus
Analytic Geometry/Elementary Func-
tions 961

Shanks Merrill E See Fleenor CR

Shao Stephen P Statistics for Business
and Economics Second Edition 462

Sharpe DW Vamos P Injective Modules 337

Shifrin Yakov Solomonovich Statistical 
Antenna Theory 1088 

Shult Ernest E See Gagen T 

Sibson Robin See Jardine N 

Sibuya Yasutaka See Harris Jr WA 

Siegel CL Moser JK Lectures on Cele-
stial Mechanics 105

Silverman Eliot N Brody LA Statistics
A Common Sense Approach 837

Simader Christian G Lecture Notes in
Mathematics-268 106


Singh Jagjit Mathematical Ideas Their 
Nature and Use 1078 

Skelton John E An Introduction to the 
BASIC Language 229 

Skorohod AV See Gihman II 

Slamecka Vladimir See Kraus DH 

Sleeman BD See Everitt WN 

Slomson AB See Bell JL 

Slook Thomas H Wurster MA Elementary
Modern Mathematics with Calculus
and Computer Programming 710

Sloyer Clifford W See Baxter WE

Smart James R Modern Geometries 1085

Smirnov VI (Ed) Linear Operators and 
Operator Equations 227 

Smith Cecil L See Murrill PW 

Smith James F See Frank TS 

Smith Kennan T Primer of Modern Analy-
sis 225

Snell J Laurie See Kemeny JG 

Sobel Max A See Meserve BA 

Solomon Charles Mathematics 336 

Sorgenfrey Robert H Wooton W Dolciani
MP Modern Algebra and Trigonometry
Structure and Method Book 2 New
Edition Teacher's Edition 575

Spencer Donald See Kumpera Antonio 

Spitzbart Abraham Analytic Geometry 958 

Spremann K See Gessner P 

Squire William Integration for Engi-
neers and Scientists 107

Srinath LS Sarma KR Patankar SV Basic
Engineering and Mathematical Tables
838

Srinivasan SK Vasudevan R Introduction
to Random Differential Equations
and Their Applications 105

Srivastava Jagdish N (Ed) A Survey of
Combinatorial Theory 1157

Standley Gerald B New Methods in Sym-
bolic Logic 222

Stanton RG See Eames WR 

Starke Peter H Abstract Automata 463 

Steen SWP Mathematical Logic with
Special Reference to the Natural
Numbers 338

Steenrod Norman E How to Write Mathema-
tics 1155

Steger Joseph A (Ed) Readings in Sta-
tistics for the Behavioral Scientist
836

Stein Sherman K Calculus and Analytic
Geometry 1082

Steiner Hans-Georg (Ed) The Teaching of 
Geometry at the Pre-College Level 1156 

Stenius Erik Critical Essays 710

Sternberg Saul Mathematics and Social
Sciences I 113




Sternberg Shlomo See Callahan J 

Sterrett Andrew See Merriman GM 

Stewart GW Introduction to Matrix Com-
putations 1084

Stewart Ian Galois Theory 833

Stockton Doris S Essential Algebra 1079 

Essential Algebra with Functions
1079
Essential Mathematics 98

Stoer Josef Einfuhrung in die Numeri-
sche Mathematik I 460

Stoyle Judith See Iglewicz B

Strebe David D Elements of Modern Arith-
metic 575

Street Anne Penfold See Wallis WD 

Stroock Daniel W Varadhan SRS (Ed)
Topics in Probability Theory Semi-
nar 1971-1972 967

Struble Mitch Stretching a Point 710

Styan George PH See Anderson TW 

Sulanke R Wintgen P Differentialgeome-
trie und Faserbundel 966

Sullivan Michael See Mizrahi A 

Suppes Patrick Axiomatic Set Theory 1080

Sveshnikov AG Tikhonov AN The Theory of
Functions of a Complex Variable 1084

Swartz Clifford E Used Math for the
First Two Years of College Science
456

Sz-Nagy Béla (Ed) Hilbert Space Opera-
tors and Operator Algebras 834

Szego GP (Ed) Minimization Algorithms
Mathematical Theories and Computer
Results 715

Szego G See Polya G 

Szokefalvi-Nagy B See Butzer PL 

Takeuti G Zaring WM Axiomatic Set
Theory 1080

Tannery Jules Molk Jules Eléments de la
Théorie des Fonctions Elliptiques
Second Edition Tome I-IV

Tanur Judith M Mosteller F (Ed) Statis-
tics A Guide to the Unknown 109

Tarski Alfred Introduction a la Logique
222

Tatsuoka Maurice M Multivariate Analy-
sis Techniques for Educational and
Psychological Research 462

Taylor Angus E Mann WR Advanced Calcu-
lus Second Edition 225

Taylor Howard E Wade TL Contemporary
Trigonometry 829

Taylor Joan Gary Applebaugh GN Anderson
D Finite Mathematics 827

Taylor Joseph L Measure Algebras 1159 

Teague Robert Computing Problems for 
Fortran Solution 463 

Teekens R Prediction Methods in
Multiplicative Models 344

Teixeira F Gomes Traité des courbes
spéciales remarquables planes et
gauches Tome I-III 342

Telling HG The Rational Quartic Curve
in Space of Three and Four Dimen-
sions 966

Telser Lester G Graves RL Functional
Analysis in Mathematical Economics
Optimization Over Infinite Horizons
1087

Templeton JGC See Bagchi TP 

Tesitelova Marie See Sgall P 

Theil Henri Principles of Econometrics
464

Statistical Decomposition Analy-
sis 584

Thomas Ann M See Thomas JW

Thomas Jr George B Calculus and Analy-
tic Geometry Alternate Edition 714

Thomas James W Thomas AM Finite Mathe-
matics 957

Thomas John B An Introduction to Applied 
Probability and Random Processes 228 

Thomas RSD See Eames WR 

Thompson Colin J Mathematical Statisti-
cal Mechanics 112

Thompson Gerald E Linear Programming An
Elementary Introduction 226

Thompson Howard E Applications of Cal-
culus in Business and Economics 1087

Thurston Hugh The Calculus An Introduc-
tion 714

Tijms HC Analysis of (s,S) Inventory
Models 1087

Tikhonov AN See Sveshnikov AG 

Tinsley JD See Corlett PN 

Tintner Gerhard Sengupta JK Stochastic
Economics Stochastic Processes Con-
trol and Programming 583

Topping David M Lectures on Von Neumann
Algebras 964

Tornig W See Ansorge R

Tou Julius T (Ed) Advances in Informa-
tion Systems Science V 4 110

Tougeron Jean Claude Idéaux de fonctions
différentiables 341

Tranter CJ Integral Transforms in Ma-
thematical Physics 583

Treiman Sam B Lectures on Current Alge- 
bra and Its Applications 1089 

Triola Mario F Mathematics and the Mo-
dern World 1079

Trotter Hale F See Williamson RE

Tschebyscheff PL Theorie der Congruen-
zen (Elemente der Zahlentheorie) 338

Tschirhart William See Munem MA 


Tucker Don H See Coles WJ 




Turner JS Buoyancy Effects in Fluids 720

Turner V Dean Prouse HL Introduction to
Mathematics 98

Turner V Dean See Prouse HL

Ullman Neil R Statistics An Applied Ap-
proach 228

US Army Transactions of the Sixteenth
Conference of Army Mathematicians 97

Transactions of the Seventeenth
Conference of Army Mathematicians 709

Usiskin Zalman See Henderson KB 

Vachek Josef See Sgall P 

Vamos P See Sharpe DW 

van der Merwe Alwyn See Yourgrau W 

Van Note Peter Tangrams Picture-Making
Puzzle Game 709

Vanstone Ray See Greub W 

Varadhan SRS See Stroock DW 

Varaiya PP Notes on Optimization 835

Vasudevan R See Srinivasan SK

Venkatarayudu T See Bhagavantam S

Verbeek A Superextensions of Topologi-
cal Spaces 1085

Verdier JL See Artin M 

Vervaat W Success Epochs in Bernoulli
Trials With Applications in Number
Theory 1086

Vick James W Homology Theory An Intro-
duction to Algebraic Topology 835

Vilenkin NYa Functional Analysis 579

Vopěnka Petr Hajek P The Theory of
Semisets 222

Wade Thomas L See Taylor HE

Waelbroeck Lucien (Ed) Lecture Notes in
Mathematics-331 1160

Walde Ralph E See Sagle AA 

Walker Carol See Richman F 

Walker Elbert See Richman F 

Wallace Philip R Mathematical Analysis
of Physical Problems 1089

Wallis Jennifer Seberry See Wallis WD 

Wallis Jennifer Wallis WD (Ed) Proceed-
ings of the First Australian Con-
ference on Combinatorial Mathema-
tics 458

Wallis WD Street AP Wallis JS Lecture
Notes in Mathematics-292 711

Wallis WD See Wallis J 

Walsh T On Summability Methods for Con-
jugate Fourier-Stieltjes Integrals
in Several Variables and Generaliza-
tions 1159

Walter Wolfgang Gewohnliche Differen-
tialgleichungen 963

Ward Jr Lewis E Topology An Outline for
a First Course 343

Wardle ME Computing in Mathematics From
Problem to Program 345




Warner Garth Harmonic Analysis on Semi-
Simple Lie Groups I 105
Harmonic Analysis on Semi-Simple
Lie Groups II 577
Wasan MT Mathematical Probability 581
Washington Allyn J An Introduction to
Calculus with Applications 225
Mathematics A Developmental Ap-
proach 339
Watanabe Satosi (Ed) Frontiers of Pat-
tern Recognition 838
Watson GN Complex Integration and
Cauchy's Theorem 1084
Webber G Cuthbert Algebraic Structures
for Teachers 959
Wehrfritz Bertram AF See Kegel OH 
Weichsel Paul M See Paley H 
Weil J Hocquemiller J Algèbre Solu-
tions Développées des exercises 712
Weinberg George H Schumaker JA Statis-
tics An Intuitive Approach Second
Edition 228
Weinberg Gerald M The Psychology of 
Computer Programming 718 
Weinberger HF A First Course in Par-
tial Differential Equations With
Complex Variables and Transform
Methods 963
Weinstein Raoul L Precalculus Mathema-
tics A Fundamental Approach 959
Weiss Edwin See Callahan J
Wells Jr RO Differential Analysis on
Complex Manifolds 962
Welsh DJA Woodall DR (Ed) Combinatorics
711
Wheeler Brandon W See Crowdis DG 
Wheeler Ruric E Peeples Jr WD Modern
Mathematics for Business Students
957
Wheeler Ruric E Fundamental College
Mathematics Number Systems and In-
tuitive Geometry 1080
Modern Mathematics An Elementary
Approach Third Edition 957
White AT See Alavi Y 
Whitehead AN The Axioms of Descriptive
Geometry 965
Whitehead Jr Earl Glen Enumerative
Combinatorics 1971-1972 100
Combinatorial Algorithms 833
Whiteside DT (Ed) The Mathematical
Papers of Isaac Newton V 5 98
Whitesitt J Eldon Principles of Modern
Algebra Second Edition 960
Whitney Hassler Complex Analytic
Varieties 1085
Whitworth Larry L See Leslie JF 


Wigner Eugene P Symmetries and
Reflections Scientific Essays of
Eugene P Wigner 827

Wilde Daniel U An Introduction to Computing
Problem-Solving Algorithms
and Data Structures 838

Wilenkin NJ Unterhaltsame Mengenlehre
1080 

Willerding Margaret F A First Course in
College Mathematics and A First
Course in College Mathematics A Worktext 958

Williams CB Style and Vocabulary Numerical
Studies 1088

Williams Gareth A Course in Linear Algebra 833

Williams IP Matrices for Scientists 712

Williams J Complex Numbers 1083 

Williams K Problems in Statistics The
Poisson and Exponential Distributions 344

Williams Keith W Introduction to College
Mathematics 828

Williams Ralph C Mathematics for Communication
Number Relations 1079

Williams Ross A See Powell AA 

Williamson Richard E Crowell RH
Trotter HF Calculus of Vector Functions
Third Edition 104

Willoughby Ralph A See Rose DJ 

Willoughby Stephen S Statistics and
Probability 836

Wilson Robin J Introduction to Graph
Theory 711 

Wintgen P See Sulanke R 

Wisner Robert J Elements of Probability
1086 

Witter George E The Structure of Mathematics
An Introduction 219

Wolf Joseph A Spaces of Constant Curvature
Second Edition 342

Wolfe Carvel S Linear Programming with 
Fortran 835 

Wolff Peter Breakthroughs in Mathematics 827

Woodall DR See Welsh DJA 

Wooton William See Sorgenfrey RH 

Wooton William See Dolciani MP 

Wooton William Modern Trigonometry Revised
Edition 575

Wraith GC See Kock A 

Wrede Robert C Introduction to Vector
and Tensor Analysis 1083

Wren F Lynwood Basic Mathematical Concepts
Second Edition 1156

Wright D Franklin Lindgren KE Intermediate
Algebra for College Students 97

Elementary Algebra for College
Students 97




Wright Robert R See Klinger WR

Wurster Marie A See Slook TH 

Yaglom IM Geometric Transformations
III 580

Yamada Y See Gallagher RH 

Yano Kentaro Ishihara S Tangent and Cotangent
Bundles Differential Geometry 1161

Yeh Rui Zong Modern Probability Theory 
1086 

Young David M Gregory RT A Survey of
Numerical Mathematics 2 Vols 1159

Young Grace Chisholm See Young WH 

Young John E See Bush GA 

Young WH Young GC The Theory of Sets
of Points Second Edition 576

Young WH The Fundamental Theorems of
the Differential Calculus 1085

Yourgrau Wolfgang van der Merwe A (Ed)
Perspectives in Quantum Theory Essays
in Honor of Alfred Landé 1088

Yushkevich Aleksandr A See Dynkin EB 

Zacks Shelemyahu The Theory of Statistical
Inference 581

Zagier Don Bernard Lecture Notes in
Mathematics-290 461

Zarantonello Eduardo H (Ed) Contributions
to Nonlinear Functional Analysis 579

Zaremba SK (Ed) Applications of Number
Theory to Numerical Analysis 105




Zarembka Paul Toward a Theory of Economic
Development 464

Zaring Wilson M See Henderson KB 

Zaring WM See Takeuti G 

Zelazko W Selected Topics in Topological
Algebras 106

Banach Algebras 835

Zelinsky Daniel A First Course in Linear
Algebra Second Edition 459

Zellner Arnold An Introduction to Bayesian
Inference in Econometrics 465

Zemanian AH Realizability Theory for
Continuous Linear Systems 461 

Zierer Ernesto The Theory of Graphs in
Linguistics 112

Zimmer Horst G Lecture Notes in
Mathematics-262 101

Zirkel Gene See Naiman A 

Zlot William Sourcebook of Fundamental
Mathematics Series Arithmetic 1079

Sourcebook of Fundamental Mathematics
Series Elementary Algebra 1079

Sourcebook of Fundamental Mathematics
Series Elementary Geometry 1079

Zubrzycki Stefan Lectures in Probability
Theory and Mathematical Statistics 717

Zunde Pranas See Kraus DH 


NEWS AND NOTICES 


PERSONAL ITEMS 
114, 231, 346, 466, 585, 721, 839, 969, 1090, 1163 


GENERAL INFORMATION 


ACM George E. Forsythe Student Paper
Competition 585

All-College Conference Room to honor David
W. Blakeslee 346

Conference on the Application of Undergraduate
Mathematics in the Life, Managerial,
Social and Engineering Sciences 467

Conference on the Influence of Computing on
Mathematical Research and Education 586

Fourteenth Biennial International Seminar of
the Canadian Mathematical Congress 346

New Sabbatical Leave Exchange Service 840

Seminar on generalized inverses and applications
at the MRC, University of Wisconsin 840

USA Mathematical Olympiad 231

Unsolved problems in mathematics 841


NECROLOGY 
Ayer Miriam C 969 Levy BR 1163 
Carter HC 969 Loring RJ 1090 
Curtis HB Jr. 969 Macdonald SL 1163 
Daus PH 1163 Moursund AF 467 
Dean AE 969 Price HV 1163 
Earl JM 467 Snyder AD 1090 
Edwards PD 1163 Vandiver HS 1163 
Foster JF 1163 Watt MW 1163 
Graves LM 1163 Winger RM 1163
Lefschetz Solomon 467




INDEX TO VOLUME 80, 1973 


REPORTS AND ANNOUNCEMENTS OF THE ASSOCIATION AND ITS SECTIONS 
MEETINGS AND ANNOUNCEMENTS OF THE ASSOCIATION 


Academic members elected into the Association 
HL ALDER 597

Acknowledgment 1184-1185 

Announcement of Lester R. Ford Awards 
HL ALDER 849 

Announcement of W. B. Ford Lecture Fund 595 

Committee on Educational Media 347 

Disability Income Plan added to the MAA 
Group Insurance Program 844 

Employment Information for Mathematicians 
115 1090 

Fifty-fourth Summer Meeting of the Association 
HL ALDER 1164-1179

Fifty-sixth Annual Meeting of the Association 
HL ALDER 587-597 

Films produced by the MAA 849 


Honorary Life Membership for Professor 
EP Starke JOSHUA BARLAZ 1181-1182

MAA publishes Guidelines for evaluating 
college mathematics programs 841 

Mathematical Sciences Employment Register — 
Open Register 1183 

New Sectional Governors of the Association 
AB WILLCOX 848

Officers and Committees as of February 1, 1973 
468-473 

Proceedings of the 1971 Summer Conference 
held at the University of Missouri, Rolla 347 

The Putnam Mathematical Competition 849 

Report of the Treasurer for the year 1972 
LEONARD GILLMAN 1180 


MEETINGS OF ITS SECTIONS 


Allegheny Mountain May 1973 MR WOODARD
1094 

Florida March 1973 FL CLEAVER 847 

Illinois May 1973 H SAAR 975

Indiana May 1972 RT HOOD 114 November
1972 RT HOOD 844 April 1973 RT HOOD 1091

Iowa April 1973 BE GILLAM 972

Louisiana-Mississippi February 1973 PL FORD
845 

Maryland-District of Columbia-Virginia November
1972 JM SMITH 723 April 1973 JM
SMITH 1182

Metropolitan New York April 1973 RORA
IACOBACCI 1092

Nebraska April 1973 HM Cox 972 

North Central October 1972 HM ANDERSON 
467 April 1973 HM ANDERSON 973 


Northeastern June 1973 GW Best 1183 

Northern California February 1972 NEWMAN 
FISHER 722 February 1973 NEWMAN FISHER 
846 

Ohio November 1972 RH ROLWING 586 April
1973 RH ROLWING 973

Oklahoma-Arkansas April 1973 EK McLACHLAN 974

Philadelphia November 1972 AE FILANO 723

Rocky Mountain May 1973 DJ STERLING 1096 

Seaway November 1972 EMMET STOPHER 587
May 1973 EMMET STOPHER 1097

Southeastern March 1973 JD NEFF 969 

Southern California March 1973 TN ROBERTSON 
971 

Southwestern April 1973 A SWIMMER 1093

Texas April 1973 JC BRADFORD 1093 


